CRISPR Oligonucleotides and Gene Editing

ABSTRACT

The present disclosure generally relates to compositions and methods for the genetic modification of cells. In particular, the disclosure relates to CRISPR reagents and the use of such reagents.

PRIORITY

This application is a divisional of U.S. patent application Ser. No. 14/879,872, filed Oct. 9, 2015, which application claims the benefit of priority to U.S. Provisional Application No. 62/061,961, filed Oct. 9, 2014, U.S. Provisional Application No. 62/101,787, filed Jan. 9, 2015, and U.S. Provisional Application No. 62/218,826, filed Sep. 15, 2015, whose disclosures are incorporated by reference in their entirety.

SEQUENCE LISTING

The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Oct. 7, 2015, is named LT00948_SL.txt and is 98,513 bytes in size.

FIELD

The present disclosure generally relates to compositions and methods for the genetic modification of cells. In particular, the disclosure relates to CRISPR reagents and the use of such reagents.

BACKGROUND

A number of genome-editing systems, such as designer zinc fingers, transcription activator-like effectors (TALEs), CRISPRs, and homing meganucleases, have been developed. One issue with these systems is that they require both the identification of target sites for modification and the design of reagents specific for those sites, which is often laborious and time consuming. In one aspect, the invention allows for the efficient design, preparation, and use of genome editing reagents.

SUMMARY

The present disclosure relates, in part, to compositions and methods for editing of nucleic acid molecules. There exists a substantial need for efficient systems and techniques for modifying genomes. This invention addresses this need and provides related advantages.

CRISPR systems do not require the generation of customized proteins to target specific sequences but rather use a single Cas enzyme that can be directed to a target nucleotide sequence (a target locus) by a short RNA molecule with sequence complementarity to the target.

The present disclosure is directed, in part, to CRISPR editing system modifications that increase the usefulness of these systems. One problem associated with gene editing systems is the amount of time and labor required to design and produce target locus specific gene editing reagents. The invention provides, in part, compositions and methods for the efficient, cost-effective production of CRISPR components.

In some specific aspects, the invention is directed to three types of sequence specific nucleic acid binding activities. Using the Cas9 proteins as an example, these three systems include those where Cas9 proteins are employed with (1) double-stranded cutting activity (e.g., one Cas9 protein gene editing systems), (2) nickase activity (e.g., two Cas9 protein gene editing systems, referred to as “dual nickase” systems), and (3) no cutting activity but with the retention of nucleic acid binding activity (e.g., “dead” Cas9, referred to as dCas9, useful, for example, for gene repression, gene activation, DNA methylation, etc.).

In some aspects, the invention provides methods for producing nucleic acid molecules, including methods comprising performing polymerase chain reactions (PCR) in reaction mixtures containing (i) a double-stranded nucleic acid segment and (ii) at least one oligonucleotide capable of hybridizing to nucleic acid at one terminus of the double-stranded nucleic acid segment, wherein the nucleic acid molecule is produced by the PCR reaction, and wherein the product nucleic acid molecule contains at or near one terminus a promoter suitable for in vitro transcription. In some instances, nucleic acid molecules produced by PCR reaction encode RNA molecules of lengths from about 20 to about 300 (e.g., from about 20 to about 250, from about 20 to about 200, from about 35 to about 150, from about 70 to about 150, from about 40 to about 200, from about 50 to about 200, from about 60 to about 200, from about 60 to about 125, etc.) nucleotides.

RNA molecules generated by methods of the invention (e.g., ligation) or encoded by nucleic acid molecules produced by methods of the invention may contain a region (e.g., from about 10 to about 50, from about 20 to about 50, from about 30 to about 50, from about 15 to about 40, from about 15 to about 30, etc. nucleotides) of sequence complementarity to a target locus. Such RNA molecules may also form one or more (e.g., two, three, four, five, etc.) hairpin turns under physiological conditions (e.g., 37° C., 10 mM Tris-HCl, pH 7.0, 0.9% sodium chloride). Further, such RNA molecules may be a CRISPR RNA such as a guide RNA molecule.

In additional aspects, the invention includes methods for producing nucleic acid molecules, these methods comprising performing polymerase chain reactions (PCR) in reaction mixtures comprising (i) a double-stranded nucleic acid segment comprising a first terminus and a second terminus, (ii) a first oligonucleotide comprising a first terminus and a second terminus, wherein the second terminus of the first oligonucleotide is capable of hybridizing to the first terminus of the double-stranded nucleic acid segment, and (iii) a second oligonucleotide comprising a first terminus and a second terminus, wherein the second terminus of the second oligonucleotide is capable of hybridizing to the first terminus of the first oligonucleotide, to produce the nucleic acid molecule. In some instances, the product nucleic acid molecule will contain one or more (e.g., one, two, three, etc.) promoters suitable for in vitro transcription at or near one terminus. Also, in some instances, the product nucleic acid molecule will encode one or more CRISPR RNAs (e.g., a crRNA molecule, a tracrRNA molecule, a guide RNA molecule, etc.). In some instances, reaction mixtures further comprise a first primer and a second primer, wherein the first primer is capable of hybridizing at or near the first terminus of the second oligonucleotide and the second primer is capable of hybridizing at or near the second terminus of the double-stranded nucleic acid segment.

The invention also includes methods for producing nucleic acid molecules, the methods comprising performing polymerase chain reactions in reaction mixtures containing (i) a first double-stranded nucleic acid segment comprising a first terminus and a second terminus, (ii) a second double-stranded nucleic acid segment comprising a first terminus and a second terminus, and (iii) at least one oligonucleotide comprising a first terminus and a second terminus, wherein the first terminus of the oligonucleotide is capable of hybridizing to nucleic acid at the first terminus of the first double-stranded nucleic acid segment to produce the nucleic acid molecule, and wherein the second terminus of the oligonucleotide is capable of hybridizing to nucleic acid at the second terminus of the second double-stranded nucleic acid segment to produce the nucleic acid molecule. In some instances, the product nucleic acid molecule will contain one or more promoters suitable for in vitro transcription at or near one terminus.

The invention further includes methods for producing nucleic acid molecules, these methods comprising performing polymerase chain reactions in reaction mixtures containing (i) a first double-stranded nucleic acid segment comprising a first terminus and a second terminus, (ii) a second double-stranded nucleic acid segment comprising a first terminus and a second terminus, (iii) a first oligonucleotide comprising a first terminus and a second terminus, and (iv) a second oligonucleotide comprising a first terminus and a second terminus, wherein the second terminus of the first oligonucleotide is capable of hybridizing to nucleic acid at the first terminus of the second double-stranded nucleic acid segment, wherein the second terminus of the second oligonucleotide is capable of hybridizing to the first terminus of the first oligonucleotide, and wherein the second terminus of the second oligonucleotide is capable of hybridizing to the first terminus of the second double-stranded nucleic acid segment. In some instances, the product nucleic acid molecules contain one or more promoters suitable for in vitro transcription at or near (e.g., within 10 base pairs) one terminus.

The invention also includes methods for producing CRISPR RNA molecules, these methods comprising contacting two or more linear RNA segments with each other under conditions that allow for the 5′ terminus of a first RNA segment to be covalently linked with the 3′ terminus of a second RNA segment to form the CRISPR RNA. In some instances, the CRISPR RNA molecules are separated from reaction mixture components (e.g., by column chromatography, such as by high-performance liquid chromatography).

The invention additionally includes methods for producing guide RNA molecules, these methods comprising: (a) separately producing a crRNA molecule and a tracrRNA molecule and (b) contacting the crRNA molecule and the tracrRNA molecule with each other under conditions that allow for the covalent linking of the 3′ terminus of the crRNA to the 5′ terminus of the tracrRNA to produce the guide RNA molecule. Guide RNA molecules may have a region of sequence complementarity of at least 10 (e.g., from about 10 to about 50, from about 10 to about 40, from about 10 to about 35, from about 10 to about 30, from about 10 to about 25, from about 15 to about 25, from about 17 to about 22, etc.) nucleotides to a target locus. In many instances, the target locus is a naturally occurring chromosomal locus in a eukaryotic cell.

The invention also includes compositions comprising two RNA molecules connected by a triazole group, wherein one of the RNA molecules has a region of sequence complementarity of at least 10 nucleotides to a target locus.

In some aspects, the invention is directed to methods for gene editing at a target locus within a cell, these methods comprising introducing into the cell at least one CRISPR protein and at least one CRISPR RNA, wherein the at least one CRISPR RNA has a region of sequence complementarity of at least 10 base pairs to the target locus. In some instances, a linear DNA segment that has sequence homology at both termini to the target locus is also introduced into the cell. In some instances, one of the at least one CRISPR proteins is a Cas9 protein. This Cas9 protein may have the ability to make a double-stranded cut in DNA or to nick double-stranded DNA. In some instances, two Cas9 proteins are introduced into the cells, where one Cas9 protein has a mutation that renders the HNH domain inactive and the other Cas9 protein has a mutation that renders the RuvC domain inactive. In some instances, two RNA molecules (e.g., CRISPR RNA molecules), each with sequence complementarity to different target sequences, are introduced into the cell. Further, these different target sequences may be located within forty (e.g., from about 2 to about 40, from about 2 to about 25, from about 2 to about 20, from about 2 to about 15, from about 2 to about 10, from about 2 to about 8, from about 4 to about 20, from about 4 to about 15, from about 4 to about 10, from about 6 to about 20, etc.) base pairs of each other. Distances between sequences may be measured in reference to the double-stranded cut or nick site. In such instances, target sequences may overlap.
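As an illustration only, the spacing criterion for a two-guide nicking arrangement can be sketched in a few lines of Python. The `NickSite` record, its fields, and the example coordinates are hypothetical and are not part of the disclosure; the sketch assumes that distance is measured between the two nick positions, that the nicks fall on opposite strands, and that the 40 base pair figure above is used as the upper limit.

```python
from dataclasses import dataclass

# Upper bound on nick spacing taken from the text ("within forty ... base pairs").
MAX_NICK_SPACING_BP = 40


@dataclass(frozen=True)
class NickSite:
    """Hypothetical record describing one predicted nick on a reference sequence."""
    name: str
    position: int  # 0-based coordinate of the nick (assumed convention)
    strand: str    # "+" or "-"


def nick_spacing(a: NickSite, b: NickSite) -> int:
    """Distance in base pairs between two nick positions."""
    return abs(a.position - b.position)


def is_candidate_nickase_pair(a: NickSite, b: NickSite,
                              max_spacing: int = MAX_NICK_SPACING_BP) -> bool:
    """True when the nicks lie on opposite strands and within max_spacing bp."""
    return a.strand != b.strand and nick_spacing(a, b) <= max_spacing


# Example with made-up coordinates.
site_1 = NickSite("Site 1", position=100, strand="+")
site_2 = NickSite("Site 2", position=120, strand="-")
site_3 = NickSite("Site 3", position=160, strand="-")

print(is_candidate_nickase_pair(site_1, site_2))
print(is_candidate_nickase_pair(site_1, site_3))
```

A narrower window (for example, one of the sub-ranges recited above) can be applied by passing a different `max_spacing` value.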

The invention further includes cells containing one or more CRISPR system components and cells made by methods set out herein. For example, the invention includes cells into which CRISPR complexes have been introduced (e.g., cells that contain (1) plasmids encoding Cas9 and guide RNA, (2) Cas9 mRNA and guide RNA, etc.). The invention further includes cells containing mRNA encoding dCas9 and fusion proteins thereof, as well as cells that have been modified by methods of the invention (e.g., cells that have undergone cleavage and religation of cellular DNA with and without inserts at the cleavage site) that either contain or no longer contain one or more CRISPR system components.

BRIEF DESCRIPTION OF THE DRAWINGS

For a more complete understanding of the principles disclosed herein, and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:

FIG. 1 is a representative diagram of a naturally occurring CRISPR system. In addition to the “Target DNA”, three additional components are required: Cas9 protein (shaded rectangle), crRNA (CRISPR RNA), and tracrRNA (trans-activating crRNA). The arrows labeled “RuvC” and “HNH” indicate cutting locations in the Target DNA. The dashed box labeled “PAM” refers to the protospacer adjacent motif.

FIG. 2 shows the association between a crRNA molecule (SEQ ID NO: 56) and a tracrRNA molecule (SEQ ID NO: 57). Hybridization Region 1 (19 nucleotides, in this instance) is complementary to the target site. Hybridization Region 2 is a region of sequence complementarity between the crRNA (41 nucleotides) and the tracrRNA (85 nucleotides). The tracrRNA 3′ region is the 3′ region of the tracrRNA molecule that extends beyond Hybridization Region 2. The Loop Replaceable Region is roughly defined by the closed box and may be replaced with a hairpin loop to connect crRNA and tracrRNA molecules into a single entity, typically referred to as a guide RNA (see FIG. 3).

FIG. 3 is a schematic of a guide RNA molecule (104 nucleotides) showing the guide RNA bound to both Cas9 protein and a target genomic locus. Hairpin Region 1 is formed by the hybridization of complementary crRNA and tracrRNA regions joined by the nucleotides GAAA. Hairpin Region 2 is formed by a complementary region in the 3′ portion of the tracrRNA. FIG. 3 discloses SEQ ID NOS 58-60, respectively, in order of appearance.

FIG. 4 is a schematic showing a nicking based nucleic acid cleavage strategy using a CRISPR system. In the top portion of the figure, two lines represent double-stranded nucleic acid. Two nick sites are indicated by Site 1 and Site 2. These sites are located within a solid or dashed box indicating the region of the nucleic acid that interacts with the CRISPR/Cas9 complex. The lower portion of the figure shows nicking actions that result in two closely positioned nicks in both strands.

FIG. 5 is a schematic showing some methodologies for transient CRISPR activity within cells. The introduction of Cas9 proteins and/or nucleic acid encoding Cas9 is shown on the left. The introduction of guide RNA, crRNA plus tracrRNA, or crRNA alone is shown on the right. The middle shows the introduction of linear DNA encoding Cas9 or Cas9 plus tracrRNA. This DNA is designed to be stably maintained in the cell.

FIG. 6 shows a workflow for synthesizing guide RNA using DNA oligo templates. Guide RNA encoding DNA template is generated using assembly PCR. Components of this assembly reaction include 1) a target specific DNA oligo (encodes the crRNA region), 2) a DNA oligo specific to the bacterial promoter used for in vitro transcription (in this case T7 promoter), and 3) overlapping PCR products encoding the tracrRNA region. A fill-in reaction followed by PCR amplification is performed in a thermocycler using DNA polymerase enzyme (in this case high fidelity PHUSION® Taq DNA polymerase) to generate full length gRNA encoding templates. Following PCR assembly, the resulting DNA template is transcribed at 37° C. to generate target specific gRNA using in vitro transcription reagents for non-coding RNA synthesis (in this case MEGASHORTSCRIPT™ T7 kit). Following synthesis, the resulting gRNA is purified using a column or magnetic bead based method. Purified in vitro transcribed guide RNA is ready for co-transfection with Cas9 protein or mRNA delivery in a host system or cell line of interest. FIG. 6 guide RNA disclosed as SEQ ID NO: 58.

FIG. 7 shows overlapping DNA oligos as template for gRNA synthesis. The T7 promoter sequence and the overlap region are each about 20 nucleotides in length. Further, the box labeled “20 bp crRNA” is the target recognition component of the crRNA. Guide RNA is synthesized using 2 overlapping DNA oligonucleotides. 1) The forward DNA oligo contains the T7 promoter region (or other relevant in vitro transcription promoter), followed by the target specific crRNA encoding region and a region that overlaps with the reverse oligonucleotide. 2) The reverse DNA oligo encodes a tracrRNA sequence that is the constant component. These two overlapping oligonucleotides are annealed and extended to generate a DNA template for gRNA in vitro transcription (IVT) using high fidelity DNA polymerase enzyme (for example, PHUSION® Taq DNA polymerase). The assembly reaction also includes a 2-3 cycle PCR condition to enrich for the full length templates. The assembled DNA template is then used to generate guide RNA at 37° C. using in vitro transcription reagents for non-coding RNA synthesis (in this case MEGASHORTSCRIPT™ T7 kit was used). Following gRNA synthesis, the product is purified using a column or, alternatively, using bead based purification methods.
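The two-oligonucleotide layout described for FIG. 7 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the sequences of the disclosure: `T7_PROMOTER` is the commonly cited minimal T7 promoter and is not quoted from this document, `CONSTANT_REGION` is a shortened stand-in for the constant tracrRNA-encoding sequence and should be replaced with the actual sequence, and the function names and the example target are hypothetical.

```python
# Assumption: commonly cited minimal T7 promoter, not taken from this document.
T7_PROMOTER = "TAATACGACTCACTATAG"
# Assumption: shortened illustrative stand-in for the constant (tracrRNA-encoding)
# region; substitute the real constant-region sequence before any practical use.
CONSTANT_REGION = "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGC"
OVERLAP_LEN = 20  # the text describes an overlap of about 20 nucleotides

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_oligos(target: str, constant: str = CONSTANT_REGION,
                  overlap_len: int = OVERLAP_LEN) -> tuple[str, str]:
    """Return (forward, reverse) oligos, both written 5' to 3'.

    forward = promoter + 20-nt target + start of the constant region (overlap)
    reverse = reverse complement of the whole constant region
    """
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("target must be 20 nucleotides of A, C, G, T")
    forward = T7_PROMOTER + target + constant[:overlap_len]
    reverse = reverse_complement(constant)
    return forward, reverse


def anneal_and_extend(forward: str, reverse: str,
                      overlap_len: int = OVERLAP_LEN) -> str:
    """Top strand of the duplex obtained after annealing and fill-in."""
    bottom_as_top = reverse_complement(reverse)
    if forward[-overlap_len:] != bottom_as_top[:overlap_len]:
        raise ValueError("oligos do not share the expected overlap")
    return forward + bottom_as_top[overlap_len:]


# Hypothetical 20-nt target used only to exercise the functions.
example_target = "ACGTACGTACGTACGTACGT"
fwd, rev = design_oligos(example_target)
template = anneal_and_extend(fwd, rev)
print(fwd)
print(rev)
print(template)
```

In this sketch the filled-in template reads promoter, target, then constant region on the top strand, which is the order needed for in vitro transcription to begin immediately upstream of the target specific region.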

FIG. 8 shows a PCR assembly based method for producing DNA molecules that encode guide RNA molecules. In this schematic, “Oligo 1” encodes a T7 promoter and part of Hybridization Region 1 and “Oligo 2” encodes part of Hybridization Region 1 and has overlapping sequence with the “3′ PCR Segment”. PCR is then used for assembly of these overlapping fragments, followed by amplification using the 5′ and 3′ primers, resulting in a double-stranded DNA molecule containing a T7 promoter operably connected to a target specific guide RNA coding sequence. RNA may be produced from this double-stranded DNA molecule by in vitro transcription. FIG. 8 guide RNA disclosed as SEQ ID NO: 58.

FIG. 9 shows a PCR assembly method for synthesizing guide RNA expressing templates. This method can be used to introduce other promoters and terminators in the context of the guide RNA. In this schematic, the overlap region between “First Oligo” and “Second Oligo” encodes “Hybridization Region 1”. The ˜ in the RNA polymerase III promoter region represents an unrepresented segment of the nucleic acid molecule because these promoters can be several hundred bases in length. The RNA polymerase III terminator sequence is not shown in this figure. The 5′ primer and 3′ primer sequences extend beyond the termini of the nucleic acid segments to which they hybridize, indicating that primers may be used to add additional functionalities to the amplified nucleic acid molecules.

FIG. 10 shows a collection of variable crRNA molecules and a constant tracrRNA molecule. A specific crRNA molecule (crRNA3 in this instance) may be selected and then linked to a tracrRNA molecule.

FIG. 11 shows an exemplary method for linking two RNA segments. The linking reaction shown in this figure, using propargyl on one terminus and azide on the other terminus, is unidirectional in that the termini with the chemical modifications are the only ones that can link with each other.

FIGS. 12A-12F show an alignment of Cas9 proteins from five Streptococcus species (SEQ ID NOS 61-63, 1 and 64, respectively, in order of appearance). Identical amino acids are shown as white characters on a black background. Conservative amino acid alterations are shown as black letters on a gray background.

FIG. 13 shows oligonucleotide designs for a one-step synthesis of gRNA template workflow. Sequence validated PCR fragment refers to a PCR fragment of a sequence validated plasmid.

FIG. 14 shows data from in vivo genome cleavage and detection assays. Gel Image A: Original cleavage assay with gRNA amplified from a plasmid versus gRNA assembled from 6 overlapping oligos. Less than 50% cleavage activity compared to plasmid is seen. This is due to incorrect assembly. Gel Image B: New assembly method as outlined in FIG. 13, with either a 20 bp overlap or 15 bp overlap. An equivalent cleavage activity compared to the plasmid control is seen.

FIG. 15 shows data derived from synthetic gRNA templates that were cloned into a ZERO BLUNT® TOPO vector. Ninety-six colonies were randomly picked for sequencing analysis. The percentage of incorrect clones was calculated.

FIG. 16 shows data showing the effect of deletions of G's from the 3′ terminus of a T7 promoter on in vitro transcription.

FIG. 17 shows an “all-in-one” vector containing a CD4 coding region. The nucleotide sequence for the vector is set out in Table 9.

FIG. 18 shows an “all-in-one” vector containing an orange fluorescent protein (OFP) coding region. The nucleotide sequence for the vector is set out in Table 10.

FIG. 19. Cell engineering workflow. On day 1, the researcher designs CRISPR targets and seeds cells. Synthesis of gRNA and cell transfection with Cas9 protein/gRNA complex (Cas9 RNP) are performed on day 2. Genome cleavage assays are carried out on days 3-4. FIG. 19 discloses SEQ ID NO: 65.

FIGS. 20A-20D. Design and synthesis of gRNA. (FIG. 20A) Design of oligonucleotide pool. The pool consists of one 80 nucleotide tracrRNA PCR fragment, two end primers, and two 34 bp oligonucleotides with 15 bp overlap. (FIG. 20B) One-step synthesis of gRNA template. Four DNA oligonucleotides and one PCR fragment were assembled in a single tube and the PCR product was analyzed by agarose gel electrophoresis (Lanes 2 and 3). A gRNA template prepared from all-in-one plasmid served as control (Lane 1). (FIG. 20C) In vitro transcription. Aliquots of PCR product (Lanes 2 and 3) along with the control (Lane 1) were subjected to in vitro transcription. The resulting product was analyzed by denaturing gel. (FIG. 20D: Error rates in synthetic gRNA templates) The DNA template of gRNA was synthesized using the standard gene synthesis approach with a set of short oligonucleotides (GS). Alternatively, the oligonucleotide pool described above was used for PCR assembly. Two standard desalted end primers (15 bp) and HPLC or PAGE-purified end primers (15 bp*) were tested in assembly PCR reaction. The synthetic gRNA templates as well as control gRNA template from the ‘all-in-one’ plasmid (plasmid) were cloned into a TOPO vector. For each individual template, 96 colonies were randomly picked for sequencing.

FIGS. 21A-21D. Lipid-mediated transfection. (FIG. 21A: DNA vs. mRNA vs. Protein) Three separate genomic loci (HPRT, AAVS or RelA) were edited via Cas9 plasmid DNA, mRNA or protein transfection of HEK293FT cells. For the HPRT target, transfection was performed in the presence or absence of serum. The efficiency of genome modification was determined by Genomic Cleavage assay. (FIG. 21B: Time course of editing) HEK293FT cells were transfected with either plasmid DNA, Cas9 mRNA/gRNA or Cas9 RNPs directed to the HPRT loci. Cell samples were taken at different time points and analyzed by genomic cleavage assays. (FIG. 21C) Western Blot analysis of samples taken at different time points. (FIG. 21D) Off-target mutation of VEGFA T3 target caused by Cas9 plasmid DNA, mRNA or protein transfection. Percentages of on-target mutation as well as OT3-2 and OT3-18 off-target mutations were determined by DNA sequencing.

FIGS. 22A-22B. Electroporation-mediated transfection. (FIG. 22A: Electroporation-mediated transfection) Mastermixes of plasmid DNA, Cas9 mRNA/gRNA or Cas9 protein/gRNA were used to electroporate Jurkat T cells using the Neon 24 optimized protocol, which varies in pulse voltage, pulse width and number of pulses. The percentage of locus-specific genome cleavage was estimated 48 hours post transfection using a genomic cleavage assay. The letter “P” has been positioned at the top of each protein lane for ease of data review. The two bars to the right of each “P” are DNA and mRNA, respectively. (FIG. 22B: Electroporation-mediated transfection) Dose-dependent effect of genome editing. While keeping the ratio of Cas9 protein/gRNA constant, different amounts of Cas9 RNPs were used for electroporation using protocol 5. Experiments were done in triplicate. The percentage of cleavage was confirmed by sequencing.

FIGS. 23A-23D. Multiple gene editing in the human genome. Jurkat T cells were cotransfected with either a Cas9 plasmid pool, a Cas9 mRNA/gRNA pool or Cas9 RNP complexes targeting AAVS1 and HPRT targets (FIG. 23A) or AAVS, RelA and HPRT gene targets (FIG. 23C). Genomic cleavage assays were performed for each locus at 48 hours post transfection. Cell aliquots were then subjected to clonal isolation by serial dilution. After clonal expansion, each locus was PCR-amplified from each clonal cell line. The PCR product was then cloned into a plasmid vector and the percentage of indel mutation was determined by sequencing of eight individual E. coli colonies. Quantitation of double mutants for AAVS1 and HPRT was based on 16 clonal cell lines (FIG. 23B: Efficiency of double mutant production), whereas quantitation of triple mutants of AAVS, RelA and HPRT was based on a total of 53 clonal cell lines derived from three independent experiments (FIG. 23D: Efficiency of triple mutant production).

FIG. 24 shows a workflow for sequential delivery of CRISPR components and donor DNA into HEK293 cells and cell enrichment wherein donor DNA was labeled with Alexa647 dye and Cas9 was fused to GFP.

FIG. 25 shows a workflow for sequential delivery of CRISPR components and donor DNA and cell enrichment. In this workflow, cells are subjected to electroporation twice with donor DNA or Cas9 RNP introduced into the cells with each electroporation.

FIG. 26 shows data derived from two series of experiments using HEK293 cells involving either co-delivery of CRISPR system components and donor DNA or sequential delivery of CRISPR system components and donor DNA. In brief, 2 μg of Cas9 protein and 500 ng of the corresponding T1, T2 and/or T8 gRNA were added to Suspension Buffer R (Thermo Fisher Scientific, DPBS, cat no. #14287) to prepare the Cas9 RNP complexes. For co-delivery, 1 μl of 50 μM donor DNA with either blunt end (B) or 5′ protrusion (50) was added to the 10 μl reaction at this point. Alternatively, 1 μg of single strand (ss) DNA oligonucleotide and 500 ng double stranded DNA fragment was added. The mixture was then electroporated into a cell line having a disrupted GFP coding sequence using 1150 volts, 20 ms and 2 pulses. The cells were immediately transferred to a 24-well plate containing 500 μl medium, followed by incubation for 48 hours prior to flow cytometric analysis. For sequential electroporation, Cas9 RNP was delivered first into the cells, followed by a quick wash with 500 μl Suspension Buffer R. Upon centrifugation, the cell pellets were resuspended in 10 μl Suspension Buffer R. After addition of the corresponding donor DNA molecule, the cells were electroporated again using the same electroporation condition. The nucleotide sequences of nucleic acid molecules used in these experiment series are set out in Table 12.

Oligonucleotides were designed in a manner to correct an alteration in nucleic acid encoding GFP resulting in the generation of fluorescence upon correction. Thus, homologous recombination corrects the alteration resulting in expression of active GFP. “% of GFP+ cells” refers to the percentage of cells that were found to contain functionally active GFP. The same assay was used to score homologous recombination in a number of additional experiments set out herein.

FIG. 27 shows data for the effect of the amount of oligonucleotide on homologous recombination. Under the conditions tested, the optimal amount of oligonucleotide is between 0.2 and 0.5 μg of single-stranded donor DNA in 10 μl of Suspension Buffer R.

FIG. 28 shows data for the effect of oligonucleotide length and phosphorothioate modifications on homologous recombination. The amount of oligonucleotide for the equal mass experiments was 0.33 μg per 10 μl reaction. For equal molarity experiments, 10 pmoles per 10 μl reaction (1 μM final concentration) was used. “N” refers to no chemical modifications. “PS” refers to phosphorothioate chemical modifications at both termini.
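The relationship between the equal mass and equal molarity conditions can be illustrated with a short calculation. The sketch below is illustrative only: it assumes an approximate average mass of 330 g/mol per nucleotide of single-stranded DNA (a rule-of-thumb value, not a figure from this disclosure), and the 100 nucleotide oligonucleotide length is a hypothetical example. The function names are likewise hypothetical.

```python
# Assumption: approximate average molar mass of one ssDNA nucleotide (g/mol).
AVG_NT_MASS_G_PER_MOL = 330.0


def micrograms_to_pmol(mass_ug: float, length_nt: int) -> float:
    """Convert a mass of ssDNA oligonucleotide (micrograms) to picomoles."""
    molar_mass = length_nt * AVG_NT_MASS_G_PER_MOL  # g/mol
    return mass_ug * 1e-6 / molar_mass * 1e12


def final_concentration_um(amount_pmol: float, volume_ul: float) -> float:
    """Concentration in micromolar; pmol per microliter equals micromolar."""
    return amount_pmol / volume_ul


# Equal molarity condition from the text: 10 pmol in a 10 microliter reaction.
print(final_concentration_um(10, 10))

# Equal mass condition from the text (0.33 micrograms) for a hypothetical
# 100-nt oligonucleotide, rounded to two decimal places.
print(round(micrograms_to_pmol(0.33, 100), 2))
```

Under these assumptions, 0.33 μg of a 100 nucleotide oligonucleotide corresponds to roughly 10 pmol, so the two conditions coincide near that length, while shorter or longer oligonucleotides at equal mass deliver proportionally more or fewer molecules.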

FIG. 29 shows a number of electroporation conditions and data resulting from their use. The data was generated using sequential delivery in the HEK 293 cells, first using Pstd electroporation conditions to deliver Cas9 RNP and the second using the indicated conditions for delivery of donor DNA. The data for Pstd with 0.2 μg of antisense donor DNA shows about a 147-fold induction of homologous recombination over the HR background induced by Cas9/donor without gRNA and about 47-fold induction over Cas9 RNP background. The data for Pstd with 0.5 μg of antisense donor DNA shows about a 126-fold induction of homologous recombination over Cas9/donor background and about 40-fold induction over Cas9 RNP background.

FIG. 30 shows the results of an experiment to determine gRNA/Cas9 complex stability. 50 μg of gRNA was combined with 150 μg of Cas9 protein, left at room temperature for 5 minutes, and then samples were stored either at 4° C. or frozen at −20° C. for the following lengths of time: 1 week (A), 2 weeks (B), 1 month (C), 2 months (D), 3 months (E), or 6 months (F). After the designated length of time, the samples were then screened using 293FT cells for cleavage activity using GENEART® Genomic Cleavage Detection Kits (Thermo Fisher Scientific, Cat. No. A24372). Cleavage activity was compared to freshly prepared gRNA/Cas9 complexes and relative activity was calculated with 1 being the same activity for both the stored sample and the freshly prepared sample. The error bars indicate one standard deviation.

FIG. 31 shows the results of an experiment to determine gRNA/Cas9 complex stability in OPTI-MEM® culture medium (Thermo Fisher Scientific, cat. no. 31985-070). Complex preparation and storage conditions were as set out in the legend to FIG. 30, with 50 μg of gRNA combined with 150 μg of Cas9 protein and 10 μl of OPTI-MEM®.

FIG. 32 shows the results of an experiment to determine Cas9 protein and LIPOFECTAMINE® RNAiMax transfection reagent (Thermo Fisher Scientific, cat. no. 13778-150) stability. Complex preparation and storage conditions were as set out in the legend to FIG. 30, with 150 ng of Cas9 protein and 1.5 μl of LIPOFECTAMINE® RNAiMax stored at 4° C. or −20° C. 50 ng of gRNA was mixed with OPTI-MEM®.

DETAILED DESCRIPTION

Definitions

As used herein the term “CRISPR activity” refers to an activity associated with a CRISPR system. Examples of such activities are double-stranded nuclease, nickase, transcriptional activation, transcriptional repression, nucleic acid methylation, nucleic acid demethylation, and recombinase.

As used herein the term “CRISPR system” refers to a collection of CRISPR proteins and nucleic acid that, when combined, result in at least one CRISPR associated activity (e.g., the target locus specific, double-stranded cleavage of double-stranded DNA).

As used herein the term “CRISPR complex” refers to the CRISPR proteins and nucleic acid (e.g., RNA) that associate with each other to form an aggregate that has functional activity. An example of a CRISPR complex is a wild-type Cas9 (sometimes referred to as Csn1) protein that is bound to a guide RNA specific for a target locus.

As used herein the term “CRISPR protein” refers to a protein comprising a nucleic acid (e.g., RNA) binding domain and an effector domain (e.g., Cas9, such as Streptococcus pyogenes Cas9). The nucleic acid binding domains interact with a first nucleic acid molecule that either has a region capable of hybridizing to a desired target nucleic acid (e.g., a guide RNA) or allows for the association with a second nucleic acid having a region capable of hybridizing to the desired target nucleic acid (e.g., a crRNA). CRISPR proteins can also comprise nuclease domains (i.e., DNase or RNase domains), additional DNA binding domains, helicase domains, protein-protein interaction domains, dimerization domains, as well as other domains.

CRISPR protein also refers to proteins that form a complex that binds the first nucleic acid molecule referred to above. Thus, one CRISPR protein may bind to, for example, a guide RNA and another protein may have endonuclease activity. These are all considered to be CRISPR proteins because they function as part of a complex that performs the same functions as a single protein such as Cas9.

In many instances, CRISPR proteins will contain nuclear localizationsignals (NLS) that allow them to be transported to the nucleus.

The amino acid sequence of a representative Cas9 protein is set outbelow in Table 1.

TABLE 1 Streptococcus pyogenes Cas9 (GenBank Accession No. WP_010922251)(SEQ ID NO: 1) 1MDKKYSIGLD IGTNSVGWAV ITDEYKVPSK KFKVLGNTDR HSIKKNLIGA LLFDSGETAE 61ATRLKRTARR RYTRRKNRIC YLQEIFSNEM AKVDDSFFHR LEESFLVEED KKHERHPIFG 121NIVDEVAYHE KYPTIYHLRK KLVDSTDKAD LRLIYLALAH MIKFRGHFLI EGDLNPDNSD 181VDKLFIQLVQ TYNQLFEENP INASGVDAKA ILSARLSKSR RLENLIAQLP GEKKNGLFGN 241LIALSLGLTP NFKSNFDLAE DAKLQLSKDT YDDDLDNLLA QIGDQYADLF LAAKNLSDAI 301LLSDILRVNT EITKAPLSAS MIKRYDEHHQ DLTLLKALVR QQLPEKYKEI FFDQSKNGYA 361GYIDGGASQE EFYKFIKPIL EKMDGTEELL VKLNREDLLR KQRTFDNGSI PHQIHLGELH 421AILRRQEDFY PFLKDNREKI EKILTFRIPY YVGPLARGNS RFAWMTRKSE ETITPWNFEE 481VVDKGASAQS FIERMTNFDK NLPNEKVLPK HSLLYEYFTV YNELTKVKYV TEGMRKPAFL 541SGEQKKAIVD LLFKTNRKVT VKQLKEDYFK KIECFDSVEI SGVEDRFNAS LGTYHDLLKI 601IKDKDFLDNE ENEDILEDIV LTLTLFEDRE MIEERLKTYA HLFDDKVMKQ LKRRRYTGWG 661RLSRKLINGI RDKQSGKTIL DFLKSDGFAN RNFMQLIHDD SLTFKEDIQK AQVSGQGDSL 721HEHIANLAGS PAIKKGILQT VKVVDELVKV MGRHKPENIV IEMARENQTT QKGQKNSRER 781MKRIEEGIKE LGSQILKEHP VENTQLQNEK LYLYYLQNGR DMYVDQELDI NRLSDYDVDH 841IVPQSFLKDD SIDNKVLTRS DKNRGKSDNV PSEEVVKKMK NYWRQLLNAK LITQRKFDNL 901TKAERGGLSE LDKAGFIKRQ LVETRQITKH VAQILDSRMN TKYDENDKLI REVKVITLKS 961KLVSDFRKDF QFYKVREINN YHHAHDAYLN AVVGTALIKK YPKLESEFVY GDYKVYDVRK 1021MIAKSEQEIG KATAKYFFYS NIMNFFKTEI TLANGEIRKR PLIETNGETG EIVWDKGRDF 1081ATVRKVLSMP QVNIVKKTEV QTGGFSKESI LPKRNSDKLI ARKKDWDPKK YGGFDSPTVA 1141YSVLVVAKVE KGKSKKLKSV KELLGITIME RSSFEKNPID FLEAKGYKEV KKDLIIKLPK 1201YSLFELENGR KRMLASAGEL QKGNELALPS KYVNFLYLAS HYEKLKGSPE DNEQKQLFVE 1261QHKHYLDEII EQISEFSKRV ILADANLDKV LSAYNKHRDK PIREQAENII HLFTLTNLGA 1321PAAFKYFDTT IDRKRYTSTK EVLDATLIHQ SITGLYETRI DLSQLGGD

As used herein, the term “transcriptional regulatory sequence” refers to a functional stretch of nucleotides contained on a nucleic acid molecule, in any configuration or geometry, that acts to regulate the transcription of (1) one or more structural genes (e.g., two, three, four, five, seven, ten, etc.) into messenger RNA or (2) one or more genes into untranslated RNA. Examples of transcriptional regulatory sequences include, but are not limited to, promoters, enhancers, repressors, and the like.

As used herein, the term “promoter” is an example of a transcriptional regulatory sequence, and is specifically a nucleic acid generally described as the 5′ region of a gene located proximal to the start codon or nucleic acid which encodes untranslated RNA. The transcription of an adjacent nucleic acid segment is initiated at the promoter region. A repressible promoter's rate of transcription decreases in response to a repressing agent. An inducible promoter's rate of transcription increases in response to an inducing agent. A constitutive promoter's rate of transcription is not specifically regulated, though it can vary under the influence of general metabolic conditions.

As used herein, the term “vector” refers to a nucleic acid molecule (e.g., DNA) that provides a useful biological or biochemical property to an insert. Examples include plasmids, phages, autonomously replicating sequences (ARS), centromeres, and other sequences which are able to replicate or be replicated in vitro or in a host cell, or to convey a desired nucleic acid segment to a desired location within a host cell. A vector can have one or more restriction endonuclease recognition sites (e.g., two, three, four, five, seven, ten, etc.) at which the sequences can be cut in a determinable fashion without loss of an essential biological function of the vector, and into which a nucleic acid fragment can be spliced in order to bring about its replication and cloning. Vectors can further provide primer sites (e.g., for PCR), transcriptional and/or translational initiation and/or regulation sites, recombinational signals, replicons, selectable markers, etc. Clearly, methods of inserting a desired nucleic acid fragment which do not require the use of recombination, transpositions or restriction enzymes (such as, but not limited to, uracil N glycosylase (UDG) cloning of PCR fragments (U.S. Pat. Nos. 5,334,575 and 5,888,795, both of which are entirely incorporated herein by reference), T:A cloning, and the like) can also be applied to clone a fragment into a cloning vector to be used according to the present invention. The cloning vector can further contain one or more selectable markers (e.g., two, three, four, five, seven, ten, etc.) suitable for use in the identification of cells transformed with the cloning vector.

As used herein the term “nucleic acid targeting capability” refers to the ability of a molecule or a complex of molecules to recognize and/or associate with nucleic acid on a sequence specific basis. As an example, Hybridization Region 1 on a crRNA molecule confers nucleic acid targeting capability upon a CRISPR complex.

As used herein the term “target locus” refers to a site within a nucleic acid molecule for CRISPR system interaction (e.g., binding and cleavage). When a single CRISPR complex is designed to cleave double-stranded nucleic acid, then the target locus is the cut site and the surrounding region recognized by the CRISPR complex. When two CRISPR complexes are designed to nick double-stranded nucleic acid in close proximity to create a double-stranded break, then the region surrounding and including the break point is referred to as the target locus.

A “counter selectable” marker (also referred to herein as a “negative selectable marker”) or marker gene as used herein refers to any gene or functional variant thereof that allows for selection of wanted vectors, clones, cells or organisms by eliminating unwanted elements. These markers are often toxic or otherwise inhibitory to replication under certain conditions which often involve exposure to specific substrates or a shift in growth conditions. Counter selectable marker genes are often incorporated into genetic modification schemes in order to select for rare recombination or cloning events that require the removal of the marker or to selectively eliminate plasmids or cells from a given population. One example of a negative selectable marker system widely used in bacterial cloning methods is the ccdA/ccdB toxin-antitoxin system.

Overview:

The invention relates, in part, to compositions and methods for the preparation of nucleic acid molecules. In particular, the invention relates to combinations of proteins and nucleic acid molecules designed to interact with other nucleic acid molecules. More specifically, the invention relates to protein nucleic acid complexes, where the nucleic acid component has sequence complementarity to a target nucleic acid molecule. In these systems, sequence complementarity between the complexed nucleic acid and the target nucleic acid molecule is used to bring the complex into association with the target nucleic acid. Once this occurs, functional activities associated with the complex may be used to modify the target nucleic acid molecule.

The invention is exemplified by CRISPR systems. The term “CRISPR” is a general term that applies to three types of systems, and system sub-types. In general, the term CRISPR refers to the repetitive regions that encode CRISPR system components (e.g., encoded crRNAs). Three types of CRISPR systems (see Table 2) have been identified, each with differing features.

TABLE 2 CRISPR System Types Overview

System | Features | Example
Type I | Multiple proteins (5-7 proteins typical), crRNA, requires PAM. DNA cleavage is catalyzed by Cas3. | S. epidermidis (Type IA)
Type II | 3-4 proteins (one protein (Cas9) has nuclease activity), two RNAs, requires PAMs. Target DNA cleavage catalyzed by Cas9 and RNA components. | Streptococcus pyogenes CRISPR/Cas9
Type III | Five or six proteins required for cutting, number of required RNAs unknown but expected to be 1, PAMs not required. Type IIIB systems have the ability to target RNA. | S. epidermidis (Type IIIA); P. furiosus (Type IIIB)

While the invention has numerous aspects and variations associated with it, the Type II CRISPR/Cas9 system has been chosen as a point of reference for explanation herein.

In certain aspects, the invention provides:

1. Individual oligonucleotides to make crRNA/tracrRNAs and collections of such oligonucleotides, as well as methods for generating and using such oligonucleotides.
2. Compositions and methods for introducing CRISPR complex components into cells.

FIG. 1 shows components and molecular interactions associated with a Type II CRISPR system. In this instance, the Cas9 mediated Streptococcus pyogenes system is exemplified.

A crRNA is shown in FIG. 1 hybridizing to both target DNA (Hybridization Region 1) and tracrRNA (Hybridization Region 2). In this system, these two RNA molecules serve to bring the Cas9 protein to the target DNA sequence in a manner that allows for cutting of the target DNA. The target DNA is cut at two sites, to form a double-stranded break.

There appears to be substantial sequence variation in tracrRNA sequence. It has been postulated that tracrRNA function relates more to RNA structure than RNA sequence.

The Cas9 protein of Streptococcus pyogenes is 1368 amino acids in length (NCBI Reference Sequence: WP_030126706.1) and contains a number of domains for the binding and cutting of nucleic acid molecules. This protein has two domains (RuvC and HNH), each of which has DNA nickase activity. When this protein nicks DNA on both strands, the nicks are in close enough proximity to result in the formation of a double-stranded break.

In some instances, CRISPR proteins will contain one or more of the following amino acid sequences: (1) YSIGLDIGTNSVG (SEQ ID NO: 2), (2) PTIYHLR (SEQ ID NO: 3), (3) RGHFLIE (SEQ ID NO: 4), (4) TKAPLSASM (SEQ ID NO: 5), (5) LRKQRTFDNG (SEQ ID NO: 6), (6) LTFRIPYYVGPLAR (SEQ ID NO: 7), (7) TLTLFEDREMI (SEQ ID NO: 8), (8) AGSPAIKKGILQ (SEQ ID NO: 9), (9) RQLVETRQITKHVA (SEQ ID NO: 10) and/or (10) QTGGFSKESIL (SEQ ID NO: 11).

While not wishing to be bound by theory, in brief, as shown in FIG. 1, crRNA hybridizes to target DNA in a region referred to as “Hybridization Region 1”. Hybridization Region 1 is typically in the range of 18 to 22 base pairs but can be longer or shorter. The crRNA thus “identifies” the target DNA sequence. The crRNA also hybridizes to the tracrRNA in a region referred to as “Hybridization Region 2”. Hybridization Region 2 is typically in the range of 15 to 25 base pairs but can be longer or shorter, and often there is not full sequence complementarity between the hybridized strands. The tracrRNA is believed to associate with the Cas9 protein, bringing the RuvC and HNH cleavage domains in contact with the target DNA.

A number of features of the CRISPR/Cas9 system, any or all of which may be used in the practice of the invention, have been identified:

1. crRNA and tracrRNA may be combined to form a guide RNA (gRNA).
2. Mutations may be introduced into Cas9 proteins that inactivate either the RuvC or HNH domains resulting in proteins with strand specific nickase activity.
3. Mutations may be introduced into Cas9 proteins that inactivate all nucleic acid cleavage activities but allow for these proteins to retain nucleic acid binding activity.
4. Sequence alterations, including truncations and multi-nucleotide deletions, can be made to the CRISPR system RNA components.

One limitation on Type II CRISPR systems is the requirement of a protospacer adjacent motif (PAM) for high level activity. Efficient binding and cleavage of DNA by Cas9-RNA requires recognition of a PAM. Typically, PAMs are three nucleotides in length.
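By way of non-limiting illustration, the PAM requirement can be sketched as a scan of one strand of a DNA sequence for candidate target sites that are immediately followed by a three-nucleotide PAM. The sketch assumes the 5′-NGG-3′ PAM commonly associated with S. pyogenes Cas9 and a 20 nucleotide target length; neither value is specified in the passage above, and the function name find_candidate_sites is hypothetical.

```python
import re


def find_candidate_sites(dna, protospacer_len=20):
    """Return (start, protospacer, pam) for each site on the given strand
    where a protospacer is immediately followed by an NGG PAM.

    Assumptions (not taken from the text): the PAM is 5'-NGG-3' and lies
    directly 3' of the protospacer; only the supplied strand is scanned.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]
```

A complete search would, under the same assumptions, also scan the reverse complement of the sequence so that sites on the opposite strand are reported.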

In many instances, it will be desirable to make two nicks in close proximity to each other when cleaving nucleic acid using methods of the invention. This is especially so when the target locus is in a cellular genome. The use of CRISPR system components that nick nucleic acid is believed to limit “off-target effects” in that a single nick at a location other than the target locus is unlikely to result in double-stranded cleavage of the nucleic acid.

FIG. 4 shows the selection of two closely associated sites that form a target locus. Each of the sites (Site 1 and Site 2) binds a CRISPR complex with nickase activity.

The two sites exemplified in FIG. 4 will generally be located sufficiently close to each other so that the double-stranded nucleic acid containing the nicks breaks. While this distance will vary with factors such as the AT/CG content of the region, the nick sites will generally be within 200 base pairs of each other (e.g., from about 1 to about 200, from about 10 to about 200, from about 25 to about 200, from about 40 to about 200, from about 50 to about 200, from about 60 to about 200, from about 1 to about 100, from about 10 to about 100, from about 20 to about 100, from about 30 to about 100, from about 40 to about 100, from about 50 to about 100, from about 1 to about 60, from about 10 to about 60, from about 20 to about 60, from about 30 to about 60, from about 40 to about 60, from about 1 to about 35, from about 5 to about 35, from about 10 to about 35, from about 20 to about 35, from about 25 to about 35, from about 1 to about 25, from about 10 to about 25, from about 15 to about 25, from about 2 to about 15, from about 5 to about 15, etc. base pairs).
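As a minimal sketch of the spacing criterion described above, candidate nick positions on opposite strands may be paired and filtered by the distance between them. The function name paired_nick_sites and the use of simple integer coordinates are assumptions made for illustration only; the 200 base pair default reflects the general upper limit stated above.

```python
def paired_nick_sites(top_strand_nicks, bottom_strand_nicks, max_distance=200):
    """Return (top, bottom, distance) for every pair of nick positions on
    opposite strands that lie within max_distance base pairs of each other,
    sorted from the closest pair to the most distant pair.
    """
    pairs = []
    for top in top_strand_nicks:
        for bottom in bottom_strand_nicks:
            distance = abs(top - bottom)
            if distance <= max_distance:
                pairs.append((top, bottom, distance))
    # Closest pairs first, since spacing is the selection criterion here.
    return sorted(pairs, key=lambda pair: pair[2])
```

A narrower window (e.g., max_distance=60) can be passed where one of the shorter ranges listed above is preferred.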

In many instances, CRISPR complexes bind with high affinity to the target locus. In many such instances, when double-stranded breaks at the target locus are desired, CRISPR complexes will be directed to the target locus in a manner such that they do not sterically interfere with each other. Thus, the invention includes methods in which CRISPR complex binding sites at a target locus are selected such that nicking activity on each strand is not significantly altered by the binding of a CRISPR complex directed to the nicking of the other strand. The invention further includes compositions for performing such methods.

TABLE 3 Predicted S. pyogenes Cas9 Functional Regions

Description | Positions | Length
RuvC-I | 1-62 | 62
Recognition lobe | 60-718 | 659
RuvC-II | 718-765 | 48
HNH | 810-872 | 63
RuvC-III | 925-1102 | 178
PAM-interacting domain | 1099-1368 | 270
PAM substrate binding | 1125-1127 | 3

S. pyogenes Cas9 protein has a number of domains (see Table 3), two of which are nuclease domains. The discontinuous RuvC-like domain is encompassed by approximately amino acids 1-62, 718-765 and 925-1102. The HNH nuclease domain is encompassed by approximately amino acid residues 810-872. The recognition lobe, approximately amino acids 60-718, recognizes and binds regions of guide RNAs in a sequence-independent manner. Deletions of some parts of this lobe abolish CRISPR activity. The PAM-interacting domain, approximately amino acids 1099-1368, recognizes the PAM motif.
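For illustration, the approximate positions in Table 3 can be held in a simple lookup structure, from which the listed lengths follow as end minus start plus one (positions being 1-based and inclusive). The names CAS9_REGIONS, region_length and regions_containing are hypothetical and are used only for this sketch.

```python
# Approximate S. pyogenes Cas9 regions copied from Table 3
# (1-based, inclusive residue positions).
CAS9_REGIONS = {
    "RuvC-I": (1, 62),
    "Recognition lobe": (60, 718),
    "RuvC-II": (718, 765),
    "HNH": (810, 872),
    "RuvC-III": (925, 1102),
    "PAM-interacting domain": (1099, 1368),
    "PAM substrate binding": (1125, 1127),
}


def region_length(start, end):
    """Length of a 1-based, inclusive residue range."""
    return end - start + 1


def regions_containing(position):
    """Names of all listed regions that include the given residue position."""
    return [
        name
        for name, (start, end) in CAS9_REGIONS.items()
        if start <= position <= end
    ]
```

Because the listed boundaries are approximate and overlap slightly, a single residue may fall within more than one region (residue 61, for example, lies in both RuvC-I and the recognition lobe).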

The nicking activity may be accomplished in a number of ways. For example, the Cas9 protein has two domains, termed RuvC and HNH, that nick different strands of double-stranded nucleic acid. Cas9 proteins may be altered to inactivate one domain or the other. The result is that two Cas9 proteins are required to nick the target locus in order for a double-stranded break to occur. For example, an aspartate-to-alanine substitution (D10A) in the RuvC catalytic domain of Cas9 from S. pyogenes converts Cas9 from a nuclease that cleaves both strands to a nickase (cleaves a single strand). Other examples of mutations that render Cas9 a nickase include H840A, N854A, and N863A.

CRISPR proteins (e.g., Cas9) with nickase activities may be used in combination with guide sequences (e.g., two guide sequences) which target respectively sense and antisense strands of the DNA target.

Another way to generate double-stranded breaks in nucleic acid using nickase activity is by using CRISPR proteins that lack nuclease activity linked to a heterologous nuclease domain. One example of this is a mutated form of Cas9, referred to as dCas9, linked to a FokI domain. FokI domains require dimerization for nuclease activity. Thus, in such instances, CRISPR RNA molecules are used to bring two dCas9-FokI fusion proteins into sufficiently close proximity to generate nuclease activity that results in the formation of a double-stranded cut. Methods of this type are set out in Tsai et al., “Dimeric CRISPR RNA-guided FokI nucleases for highly specific genome editing,” Nature Biotech., 32:569-576 (2014) and Guilinger et al., “Fusion of catalytically inactive Cas9 to FokI nuclease improves the specificity of genome modification,” Nature Biotech., 32:577-582 (2014).

Transient Activity

One need is for a genome editing system having transient or highly regulatable activity. Transient activity is important for a number of applications, for example, the construction of cell lines involving one or more nuclease activities. Once a cellular nucleic acid, for example, has been effectively exposed to a nuclease and appropriately cut, repair of the nucleic acid (e.g., via non-homologous end-joining) normally takes place. Repair of the cellular nucleic acid is generally required for the cell to remain viable. In many cases, the cell will either integrate nucleic acid into the repaired nucleic acid molecule or nucleic acid will be removed (e.g., from 1 base pair to about 100 base pairs) from the repaired nucleic acid molecule. In either instance, a heritable change occurs within the genome of the cell. Cells with genetic changes can then be screened to identify ones with a desired alteration. Once cells with desired changes are identified, for most applications, it is beneficial to maintain the cells without further nuclease induced genetic change. Thus, it is generally desirable that the nuclease activity used to facilitate the genetic changes not remain active within the cells.

Transient activity can be achieved in a number of ways, some of which are represented in FIG. 5. CRISPR systems typically require that all necessary components be present for activity. Using a CRISPR/Cas9 system for reference, a target nucleic acid molecule must be contacted with a Cas9 protein and one or more CRISPR nucleic acid molecules (e.g., either (1) a crRNA molecule and a tracrRNA molecule or (2) a guide RNA molecule).

The invention thus includes compositions and methods for transient CRISPR mediated activities (e.g., nuclease activity). Transient activity may be generated in any number of ways. One feature of CRISPR systems is that all components typically need to come together for activity. These components are (1) one or more CRISPR proteins (e.g., Cas9), (2) Hybridization Region 1 (e.g., crRNA), and (3) nucleic acid that associates with both Hybridization Region 1 and the one or more CRISPR proteins. Thus, if one or more components required for CRISPR mediated activity are removed, then the activity is inhibited.

Using the Cas9 based CRISPR system for purposes of illustration, three components are required for CRISPR mediated activity: (1) Cas9 protein, (2) crRNA, and (3) tracrRNA. Thus, transient systems can be generated by the time limited presence of any one of these components. A number of variations are represented in FIG. 5.

TABLE 4 Exemplary CRISPR Components

Row No. | Cas9 Protein (Col. A) | Format 1: crRNA (Col. B) | Format 1: tracrRNA (Col. C) | Format 2: Guide RNA (Col. D)
1 | Integrated Coding Seq. | Integrated Coding Seq. | Integrated Coding Seq. | Integrated Coding Seq.
2 | Protein | crRNA | tracrRNA | Guide RNA
3 | Linear Coding Seq. | Linear Coding Seq. | Linear Coding Seq. | Linear Coding Seq.
4 | Vector | Vector | Vector | Vector
5 | mRNA | — | — | —

As noted above, in a Cas9 mediated system, Cas9 protein must be present for activity. Further, proteins normally are fairly stable molecules within cells. Cas9 proteins may be modified to enhance intracellular degradation (e.g., proteasome mediated degradation) by, for example, ubiquitination.

Cas9 protein may be either introduced into cells (Row 2, Column A) or produced intracellularly (Rows 1, 3, 4, and 5, Column A). Further, the duration of time that Cas9 protein is taken up or produced intracellularly and the amount that is present intracellularly may be controlled or regulated. As an example, a chromosomally integrated Cas9 protein coding sequence may be operably linked to a regulatable promoter. Further, the amount of mRNA encoding Cas9 protein introduced into cells may be regulated.

With respect to non-coding CRISPR RNA needed for high level CRISPR activity, at least two formats are possible: (1) separate crRNA and tracrRNA molecules and (2) Guide RNA (see Table 4).

The invention thus includes compositions and methods for transient production of CRISPR mediated activities within cells. Such methods include, for example, the use of a combination of stable and unstable CRISPR system components. One example is a system where mRNA encoding wild-type Cas9 protein and a guide RNA are introduced into a cell in roughly equal amounts. In this example, the presence of Cas9 mRNA will result in the production of a stable Cas9 protein and the limiting factor on CRISPR mediated activity will typically be determined by the amount of guide RNA present and guide RNA degradation.

The production and/or intracellular introduction of various components of CRISPR mediated systems may be accomplished in a number of ways. For example, a cell designed for convenient CRISPR system reconstitution could be produced. One example of such a cell would be a mammalian cell line (e.g., CHO, 293, etc.) that contains nucleic acid encoding Cas9 protein and tracrRNA integrated into the genome. CRISPR mediated activities can then be directed to a specific target sequence by the introduction into the cell line (e.g., via transfection) of crRNA. In such an exemplary cell line, Cas9 and/or tracrRNA coding sequences may be constitutively expressed or regulatably expressed (e.g., operably linked to an inducible or a repressible promoter).

The invention thus includes cell lines (e.g., eukaryotic cell lines) that contain one or more components of a CRISPR system, as well as methods for directing one or more CRISPR mediated activities to specific target loci within such cells. In many instances, this will result from the addition to or production of at least one additional component that results in target locus CRISPR mediated activities within the cell.

Hybridization Region 1 (HR1)

HR1 (also referred to as Target Complementary crRNA) is believed to determine the target nucleic acid sequence with which the CRISPR complex associates. HR1 may vary in length, nucleotide composition (e.g., AT/CG ratio), and level of sequence complementarity with the target sequence (e.g., 100%).

As noted above, the length of HR1 may vary. The length of HR1 is determined by the number of nucleotides of sequence complementarity to target nucleic acid, not including internal mismatches. For example, if the crRNA or guide RNA has a twenty-two nucleotide region where the ten 5′ most terminal nucleotides and the ten 3′ most terminal nucleotides are 100% complementary to the sequence of a target nucleic acid, then the HR1 region is twenty-two nucleotides in length with two internal mismatches. In such an instance, HR1 would share about 91% sequence complementarity with the sequence of the target nucleic acid.
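The arithmetic in the foregoing example (20 complementary positions out of 22, or about 91%) can be sketched as follows. The function name hr1_complementarity and the convention that both sequences are written 5′ to 3′ are assumptions made for illustration; the sketch counts only Watson-Crick pairs between an RNA HR1 and the DNA strand to which it hybridizes.

```python
# Watson-Crick partner on the DNA strand for each RNA base.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def hr1_complementarity(hr1_rna, target_dna):
    """Return (length, mismatches, percent_complementarity) for an HR1 RNA
    and the target DNA strand it hybridizes to, both written 5'->3'.
    """
    hr1_rna = hr1_rna.upper()
    target_dna = target_dna.upper()
    if len(hr1_rna) != len(target_dna):
        raise ValueError("sequences must be the same length")
    # Hybridization is antiparallel: HR1 position i pairs with the i-th base
    # counted from the 3' end of the target strand.
    mismatches = sum(
        1
        for rna_base, dna_base in zip(hr1_rna, reversed(target_dna))
        if RNA_TO_DNA_PAIR.get(rna_base) != dna_base
    )
    length = len(hr1_rna)
    percent = 100.0 * (length - mismatches) / length
    return length, mismatches, percent
```

For a twenty-two nucleotide HR1 with two internal mismatches, the function returns a length of 22, two mismatches, and a value of 100 × 20/22, which rounds to 91%.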

HR1 used in compositions and methods of the invention may vary from about 12 nucleotides to about 35 nucleotides (e.g., from about 13 nucleotides to about 33 nucleotides, from about 15 nucleotides to about 33 nucleotides, from about 17 nucleotides to about 33 nucleotides, from about 18 nucleotides to about 33 nucleotides, from about 19 nucleotides to about 33 nucleotides, from about 20 nucleotides to about 33 nucleotides, from about 21 nucleotides to about 33 nucleotides, from about 13 nucleotides to about 30 nucleotides, from about 15 nucleotides to about 30 nucleotides, from about 18 nucleotides to about 30 nucleotides, from about 20 nucleotides to about 30 nucleotides, from about 13 nucleotides to about 27 nucleotides, from about 15 nucleotides to about 27 nucleotides, from about 18 nucleotides to about 27 nucleotides, from about 20 nucleotides to about 27 nucleotides, from about 13 nucleotides to about 25 nucleotides, from about 15 nucleotides to about 25 nucleotides, from about 17 nucleotides to about 25 nucleotides, from about 18 nucleotides to about 25 nucleotides, from about 20 nucleotides to about 25 nucleotides, from about 13 nucleotides to about 23 nucleotides, from about 15 nucleotides to about 23 nucleotides, from about 18 nucleotides to about 23 nucleotides, from about 20 nucleotides to about 23 nucleotides, etc.).

HR1 may be designed with sequence complementarity to target nucleic acid with particular AT/CG ratios. AT/CG ratios may be altered to adjust hybridization “affinity” between HR1 and the specific target nucleic acid. A-T pairs hybridize less tightly than C-G pairs. Thus, hybridization strength can be varied by altering the AT/CG ratio of HR1. In some instances, higher binding affinity and in some instances lower binding affinity may be desired.

Further, crRNA and guide RNA molecules may be designed with AT/CG contents for the reduction of off target effects. The human genome, for example, has an average CG content of around 41 to 42%. Thus, nucleic acids containing an HR1 with a CG content of greater or less than 41 to 42% are less likely to share significant sequence complementarity with nucleic acid other than the intended target nucleic acid. Also, fewer off target effects would be expected the further the AT/CG ratio of HR1 and the target nucleic acid are from the average AT/CG ratio of the genome or other nucleic acid molecule being altered.

TABLE 5 Genomic CG Content of Select Eukaryotes

Genome | Avg. CG Content
Homo sapiens | 41 to 42%
Arabidopsis thaliana | ~36%
Saccharomyces cerevisiae | ~38%
Plasmodium falciparum | ~20%

HR1 used in compositions and methods of the invention thus may have AT/CG ratios in the range of from about 1:5 to about 5:1 (e.g., from about 1:4 to about 5:1, from about 1:3 to about 5:1, from about 1:2 to about 5:1, from about 1:1 to about 5:1, from about 1:5 to about 4:1, from about 1:4 to about 4:1, from about 1:3 to about 4:1, from about 1:2 to about 4:1, from about 1:1 to about 4:1, from about 1:5 to about 3:1, from about 1:4 to about 3:1, from about 1:3 to about 3:1, from about 1:2 to about 3:1, from about 1:1 to about 3:1, from about 1:5 to about 2:1, from about 1:4 to about 2:1, from about 1:3 to about 2:1, from about 1:2 to about 2:1, or from about 1:1 to about 2:1).
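As a minimal sketch, the CG content and AT/CG ratio of a candidate HR1 (or of the corresponding target sequence) may be computed and checked against a chosen ratio range. The function names below are hypothetical, and treating U as equivalent to T for counting purposes is an assumption made so that RNA and DNA sequences can be handled alike.

```python
from fractions import Fraction


def _normalize(seq):
    """Upper-case the sequence and count U as T (assumption for RNA input)."""
    return seq.upper().replace("U", "T")


def cg_content(seq):
    """Percentage of positions that are C or G."""
    seq = _normalize(seq)
    return 100.0 * (seq.count("C") + seq.count("G")) / len(seq)


def at_cg_ratio(seq):
    """Exact AT:CG ratio as a Fraction (AT count divided by CG count)."""
    seq = _normalize(seq)
    at = seq.count("A") + seq.count("T")
    cg = seq.count("C") + seq.count("G")
    if cg == 0:
        raise ValueError("sequence contains no C or G; ratio is undefined")
    return Fraction(at, cg)


def within_ratio_range(seq, low=Fraction(1, 5), high=Fraction(5, 1)):
    """True if the AT/CG ratio lies in [low, high]; defaults are 1:5 to 5:1."""
    return low <= at_cg_ratio(seq) <= high
```

The value returned by cg_content may also be compared with a genome average such as those in Table 5 (e.g., 41 to 42% for Homo sapiens) when selecting sequences whose composition differs from that of the genome being altered.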

Binding affinity between HR1 and the target nucleic acid can be varied by a combination of HR1 length, AT/CG content, and percent sequence complementarity. In most instances, sequence complementarity between HR1 and the target nucleic acid will be 100% but this can vary, for example, from about 80% to about 100%, from about 90% to about 100%, from about 95% to about 100%, from about 80% to about 95%, from about 85% to about 95%, or from about 90% to about 95%. HR1 used in compositions and methods of the invention may have sequence complementarity characteristics referred to above.

HR1 may also be designed using bioinformatic data to limit off-target effects. Complete genome sequence data is available for thousands of genomes. When a CRISPR system is engineered to modify the genome of a specific organism, the genome of that organism (assuming the genome sequence is known) may be analyzed to select a region that is unique and/or has no counter-part region with a sequence similar enough for substantial levels of CRISPR complex binding. This may be done through a combination of site selection and preparation of HR1 to bind to the selected site.
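One simple way to approximate the uniqueness analysis described above is to slide a window along a reference sequence and count positions that match a candidate site with no more than a chosen number of mismatches. The sketch below is illustrative only: the function names and the mismatch threshold are assumptions, only the supplied strand is scanned, and a practical analysis of a complete genome would ordinarily rely on an indexed search rather than this direct comparison.

```python
def near_matches(genome, site, max_mismatches=2):
    """Return (position, mismatches) for every window of the genome string
    that differs from the candidate site at no more than max_mismatches
    positions. Only the strand as given is scanned.
    """
    genome = genome.upper()
    site = site.upper()
    width = len(site)
    hits = []
    for start in range(len(genome) - width + 1):
        window = genome[start:start + width]
        mismatches = sum(1 for a, b in zip(window, site) if a != b)
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


def is_unique_site(genome, site, max_mismatches=2):
    """True if the only window within the mismatch threshold is a single hit."""
    return len(near_matches(genome, site, max_mismatches)) == 1
```

A candidate site for which is_unique_site returns False at the chosen threshold has at least one similar counter-part region and may be a less suitable choice for HR1 design.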

Hybridization Region 2 (HR2)

HR2 is a region of sequence complementarity either (1) between the crRNA and the tracrRNA or (2) within the guide RNA. In a guide RNA, this region forms a hairpin (Hairpin Region 1 in FIG. 3).

CRISPR Proteins

Depending upon the type of CRISPR system, one or more CRISPR proteins (e.g., Cas9) may be used. These CRISPR proteins are targeted to a first nucleic acid of defined sequence (a target locus) by a second nucleic acid and function either alone or in conjunction with other proteins. Thus, the CRISPR complex is a nucleic acid guided, nucleic acid recognition system.

CRISPR proteins or protein complexes will typically have binding activity for one or more CRISPR oligonucleotides and a nucleic acid modification activity (e.g., recombinase activity, methylase activity, etc.). Further, a nuclear localization signal may be present in CRISPR proteins or protein complexes, especially when (1) generated in or (2) designed or produced for introduction into a eukaryotic cell.

Thus, CRISPR proteins may be fusion proteins comprising, for example, the CRISPR protein or fragment thereof and an effector domain. Suitable effector domains include, for example, nucleic acid cleavage domains (e.g., heterologous cleavage domains such as the cleavage domain of the endonuclease FokI), epigenetic modification domains, transcriptional activation domains (e.g., a VP16 domain), and transcriptional repressor domains. Each fusion protein may be guided to a specific chromosomal locus, for example, by a specific guide RNA, wherein the effector domain mediates targeted genome modification or gene regulation.

In some aspects, the fusion proteins can function as dimers thereby increasing the length of the target site and increasing the likelihood of its uniqueness in the genome (thus, reducing off target effects). For example, endogenous CRISPR systems modify genomic locations based on DNA binding word lengths of approximately 13-20 bp (Cong et al., Science, 339:819-823 (2013)).

CRISPR proteins may be synthesized and/or purified by any number of means. In many instances, CRISPR proteins will be produced within the cell in which activity is desired. In some instances, CRISPR proteins may be produced extracellular to the cell in which activity is desired and then introduced into the cell. Examples of methods for producing such CRISPR proteins include in vitro translation, extraction of the proteins from cells that express these proteins encoded by an expression vector, and extraction of these proteins from cells that normally express them.

CRISPR Oligonucleotides

CRISPR oligonucleotides may be produced by a number of methods and may be generated to have varying features. In many instances, CRISPR oligonucleotides will be one component or two components. By “one component” is meant that only one oligonucleotide (e.g., guide RNA) is necessary for CRISPR activity. By “two components” is meant that only two different oligonucleotides (e.g., crRNA and tracrRNA) are required for CRISPR activity. CRISPR systems with more than two components may also be designed, produced and used. Thus, the invention contemplates multi-component CRISPR oligonucleotides where functionality involves three, four, five, etc. oligonucleotides.

In some instances, two or more oligonucleotides may be generated separately and then joined to each other to form, for example, one oligonucleotide that functions as part of a CRISPR system. The number of components of a system is determined by interaction with Cas9. As an example, if two oligonucleotides are produced and then joined prior to introduction into a cell, where the joined oligonucleotide requires no additional oligonucleotides to facilitate a CRISPR mediated activity, then this is said to be a one component system.

Of course, the nucleotide sequences and other features of CRISPR oligonucleotides may vary with specific systems and desired functions. Common features of CRISPR oligonucleotides include association with one or more CRISPR complex proteins (e.g., Cas9) and nucleic acid “targeting” capability.

The invention thus includes compositions and methods for the production of CRISPR oligonucleotides, as well as collections of oligonucleotides generated, for example, using such compositions and methods.

In some embodiments, compositions and methods of the invention are directed to one of or a combination of molecular biology synthesis (e.g., PCR) and/or chemical synthesis for the generation of CRISPR oligonucleotides. Using the schematic representation shown in FIG. 6 for reference, two chemically synthesized oligonucleotides encoding components of a guide RNA may be designed to hybridize with each other and be extended to form a fully double-stranded nucleic acid molecule (e.g., DNA).

FIG. 6 shows an exemplary workflow of the invention. The schematic in FIG. 6 shows oligonucleotides designed to generate a DNA molecule where the guide RNA coding region is operably linked to a T7 promoter. In this workflow, DNA oligonucleotides, either alone or in conjunction with double-stranded DNA, are used to generate, via PCR, a DNA molecule encoding a guide RNA operably linked to a promoter suitable for in vitro transcription. The DNA molecule is then transcribed in vitro to generate guide RNA. The guide RNA may then be “cleaned up” by, for example, column purification or bead based methods. The guide RNA is then suitable for use by, as examples, (1) direct introduction into a cell or (2) introduction into a cell after being complexed with one or more CRISPR proteins. Nucleic acid operably connected to a T7 promoter can be transcribed in mammalian cells when these cells contain T7 RNA polymerase (Lieber et al., Nucleic Acids Res., 17: 8485-8493 (1989)). Of course, other promoters functional in eukaryotic cells (e.g., CMV promoter, U6 promoter, H1 promoter, etc.) could also be used for the intracellular production of guide RNA. The H1 promoter, for example, is about 300 base pairs in length. One advantage of the T7 promoter is its small size (20 base pairs). One specific T7 promoter that may be used in compositions and methods of the invention has the following nucleotide sequence: GAAATTAATACGACTCACTATAG (SEQ ID NO: 12).

The T7 promoter may also be used to generate guide RNA in an in vitro transcription system. In this instance, the double-stranded nucleic acid molecule would be used to generate guide RNA extracellularly for introduction into a cell.

Advantages of the guide RNA generation methods set out in FIG. 6 are speed and low cost of production. In particular, once a target sequence has been identified the Forward Oligo may be generated and combined with the Reverse Oligo in a reaction mixture designed to extend each of the oligonucleotides to form the double-stranded nucleic acid molecule (see FIG. 7). The Forward Oligo encodes the crRNA sequence designed with sequence complementarity to the target locus. Further, the Reverse Oligo has a sequence that is common to guide RNAs. The Reverse Oligo may be generated by any means and stored as a standard component. The Forward Oligo, however, is target sequence specific so it must be designed in view of the target locus.

Two oligonucleotides suitable for the generation of double-stranded DNA suitable for transcription as set out in FIG. 6 are shown in FIG. 7. In the schematic of FIG. 7, the “Forward Oligo” is tailored for the target locus because it contains Hybridization Region 1 of the target specific crRNA. The “Reverse Oligo” contains regions of the tracrRNA and crRNA that are not target locus specific. Thus, the “Reverse Oligo” can be a “stock” component. The invention thus includes compositions and methods for the formation of guide RNA molecules. Methods of this aspect of the invention may comprise one or more of the following:

    a. identification of a target locus,
    b. the in silico design of one or more CRISPR RNA molecules with sequence complementarity to that locus,
    c. the production of a first oligonucleotide with a promoter sequence and a region of sequence complementarity (e.g., 15 to 25 nucleotides in length) to the target locus,
    d. incubating the first oligonucleotide with (i) a second oligonucleotide and (ii) a polymerase under conditions suitable for performing polymerase chain reaction (PCR) to generate a double-stranded nucleic acid molecule, wherein the first oligonucleotide and the second oligonucleotide have a region of sequence complementarity of sufficient length to allow for hybridization between the two oligonucleotides,
    e. performing an in vitro transcription reaction on the PCR generated double-stranded nucleic acid molecule to produce a guide RNA molecule, and
    f. purifying the guide RNA molecule from the other components of the reaction mixture.
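Steps (c) through (e) above can be sketched in silico. The following Python example is a minimal illustration only, not the method of the invention. It uses the T7 promoter sequence of SEQ ID NO: 12; the constant region `CONSTANT_REGION`, the 20 base overlap length `OVERLAP_LEN`, the example target, and all function names are assumptions introduced for illustration and should be replaced by the sequences actually used. The transcript is taken to begin at the final G of the promoter sequence, which is the usual T7 initiation site.

```python
# Sketch: design a target-specific Forward Oligo, pair it with a constant
# Reverse Oligo, and model overlap extension plus in vitro transcription.

T7_PROMOTER = "GAAATTAATACGACTCACTATAG"  # SEQ ID NO: 12 from the text

# ASSUMPTION: placeholder for the constant (non-target-specific) guide RNA
# region; substitute the sequence actually used.
CONSTANT_REGION = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCC"
    "GTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC"
)
OVERLAP_LEN = 20  # ASSUMPTION: length of the region shared by both oligos

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_forward_oligo(target):
    """Promoter + target region + start of the constant region (overlap)."""
    target = target.upper()
    if not 15 <= len(target) <= 25:
        raise ValueError("target region should be 15 to 25 nucleotides")
    if set(target) - set("ACGT"):
        raise ValueError("target must contain only A, C, G and T")
    return T7_PROMOTER + target + CONSTANT_REGION[:OVERLAP_LEN]


def design_reverse_oligo():
    """Constant 'stock' oligo: reverse complement of the constant region."""
    return reverse_complement(CONSTANT_REGION)


def assemble_template(forward, reverse):
    """Top strand of the duplex formed when both oligos are extended."""
    reverse_as_top = reverse_complement(reverse)
    if not forward.endswith(reverse_as_top[:OVERLAP_LEN]):
        raise ValueError("oligos do not share the expected overlap")
    return forward + reverse_as_top[OVERLAP_LEN:]


def transcribe(template):
    """Model the transcript, starting at the final G of the promoter."""
    start = template.index(T7_PROMOTER) + len(T7_PROMOTER) - 1
    return template[start:].replace("T", "U")


target = "GACGTCAGCTAGCATCGATC"  # hypothetical 20 nt target region
template = assemble_template(design_forward_oligo(target), design_reverse_oligo())
guide_rna = transcribe(template)
print(len(template), guide_rna[:21])
```

Only `design_forward_oligo` depends on the target, which mirrors the point made above that the Reverse Oligo can be held as a stock component.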

In the workflow shown in FIG. 8, two oligonucleotides with the T7 promoter and Hybridization Region 1 nucleic acid and a Double-Stranded Nucleic Acid Segment are assembled by PCR. In this workflow, the Double-Stranded Nucleic Acid Segment has a constant sequence and, thus, can be a stock component.

The two oligonucleotides form the full length double-stranded nucleic acid segment via a polymerase mediated assembly reaction. Once the full length product molecule is assembled, further PCR reactions amplify the product. The primers prevent the two oligonucleotides from being PCR “limiting” components. In other words, once the product nucleic acid molecule has been generated, the primers allow for amplification to continue after the first and second oligonucleotides have been consumed.

FIG. 9 shows a process similar to that represented in FIG. 8 but the assembly reaction links two double-stranded nucleic acid segments and inserts specific nucleic acid in between them. Thus, the method represented in FIG. 9 is especially useful for the insertion of a nucleic acid segment of designed sequence between two selected nucleic acid molecules.

With respect to CRISPR RNA coding sequence construction, the First Oligonucleotide and the Second Oligonucleotide may be synthesized to hybridize with the First Nucleic Acid Segment and the Second Nucleic Acid Segment. Each of these oligonucleotides also encodes all or part of Hybridization Region 1. Assembly reactions may thus be designed to generate, for example, a DNA molecule that encodes a target locus specific guide RNA operably linked to a promoter.

While only one oligonucleotide is required for assembly reactions of the type shown in FIG. 9, two will generally be used because crRNA Hybridization Region 1 is typically about 20 bases in length and about 15 bases of sequence identity is desired for efficient hybridization to the First Nucleic Acid Segment and the Second Nucleic Acid Segment. While oligonucleotides of 45 to 55 bases can be chemically synthesized, sequence fidelity often drops with length. The introduction of a crRNA Hybridization Region 1 segment with low sequence fidelity results in two issues: (1) an increase in “off-target” effects can occur due to the Hybridization Region 1 associating with loci other than the desired target locus and (2) decreased target locus interaction efficiency of the encoded guide RNAs.
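The effect of oligonucleotide length on sequence fidelity can be illustrated with a simple model. The sketch below assumes, purely for illustration, that each base is incorporated incorrectly with an independent probability `per_base_error`; the 1% value used in the example is hypothetical and is not taken from this disclosure. Under that assumption, the fraction of error-free molecules of length n is (1 − per_base_error)^n.

```python
def error_free_fraction(length, per_base_error):
    """Fraction of synthesized oligos with no errors, assuming each base
    fails independently with probability per_base_error (an assumption)."""
    if length < 0 or not 0.0 <= per_base_error <= 1.0:
        raise ValueError("length must be >= 0 and error rate within [0, 1]")
    return (1.0 - per_base_error) ** length


# Hypothetical 1% per-base error rate: compare shorter and longer oligos.
for n in (20, 35, 50):
    print(n, round(error_free_fraction(n, 0.01), 3))
```

Under this model, a shorter oligonucleotide always yields a larger error-free fraction than a longer one made with the same chemistry, which is consistent with the preference for two shorter oligonucleotides noted above.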

The second issue above occurs when heterogeneous PCR assembled nucleic acids (e.g., DNA) are transcribed (e.g., via in vitro transcription) and then introduced into cells. In general, the lower the level of sequence fidelity in the original assembly oligonucleotide population, the greater the variation in Hybridization Region 1 of the expressed guide RNA population. One way to address this problem is to use oligonucleotides generated with high sequence fidelity.

FIG. 9 represents a design for synthetic guide RNA expression cassette assembly. In this design, the target specific variable crRNA region is encoded by the 35 to 40 base pair DNA oligos (represented here as the first and second oligos). All the remaining DNA oligos and double stranded DNA segments may be constant components. These constant components include (1) a first double stranded nucleic acid segment encoding, for example, an RNA polymerase III promoter that can be leveraged for expressing the non-coding guide RNA component in vivo, (2) a second double stranded nucleic acid segment encoding the tracrRNA component, and (3) 5′ and 3′ primers for amplification and enrichment of full length guide RNA expression templates containing the RNA polymerase III promoter. A full length guide RNA expression cassette containing the relevant RNA polymerase III promoter is generated by assembly PCR using the double stranded nucleic acid segments, target specific overlapping oligos and flanking PCR primers. Assembly PCR is performed using a Taq DNA polymerase (e.g., PHUSION® Taq DNA polymerase), with the resulting product being column purified prior to delivery into the host cell line of interest. Methods such as this can also be used to generate guide RNA expression cassettes containing any user defined promoter.

The invention further includes compositions and methods for the assembly of CRISPR RNA molecules (e.g., guide RNA molecules). CRISPR RNA molecules may be assembled by the connection of two or more (e.g., two, three, four, five, etc.) RNA segments with each other. In particular, the invention includes methods for producing nucleic acid molecules, these methods comprising contacting two or more linear RNA segments with each other under conditions that allow for the 5′ terminus of a first RNA segment to be covalently linked with the 3′ terminus of a second RNA segment.

This form of assembly has the advantage that it allows for rapid and efficient assembly of CRISPR RNA molecules. Using the schematic shown in FIG. 10 for purposes of illustration, guide RNA molecules with specificity for different target sites can be generated using a single tracrRNA molecule/segment connected to a target site specific crRNA molecule/segment. FIG. 10 shows four tubes with different crRNA molecules with crRNA molecule 3 being connected to a tracrRNA molecule to form a guide RNA molecule. Thus, FIG. 10 shows the connection of two RNA segments to form a product RNA molecule. Thus, the invention includes compositions and methods for the connection (e.g., covalent connection) of crRNA molecules and tracrRNA molecules.

The invention also includes compositions and methods for the production of guide RNA molecules with specificity for a target site, the method comprising: (1) identification of the target site, (2) production of a crRNA segment, and (3) connection of the crRNA segment with a tracrRNA segment. In such methods, the tracrRNA segment may be produced prior to connection with the crRNA and stored as a “stock” component or the tracrRNA segment may be generated from a DNA molecule that encodes the tracrRNA.

RNA molecules/segments connected to each other in the practice of the invention may be produced by any number of means, including chemical synthesis and transcription of DNA molecules. In some instances, RNA segments connected to each other may be produced by different methods. For example, a crRNA molecule produced by chemical synthesis may be connected to a tracrRNA molecule produced by in vitro transcription of DNA or RNA encoding the tracrRNA.

RNA segments may also be connected to each other by covalent coupling. RNA ligase, such as T4 RNA ligase, may be used to connect two or more RNA segments to each other. When a reagent such as an RNA ligase is used, a 5′ terminus is typically linked to a 3′ terminus. If two segments are connected, then there are two possible linear constructs that can be formed (i.e., (1) 5′-Segment 1-Segment 2-3′ and (2) 5′-Segment 2-Segment 1-3′). Further, intramolecular circularization can also occur. Both of these issues can be addressed by blocking one 5′ terminus or one 3′ terminus so that RNA ligase cannot ligate the terminus to another terminus. Thus, if a construct of 5′-Segment 1-Segment 2-3′ is desired, then placing a blocking group on either the 5′ end of Segment 1 or the 3′ end of Segment 2 will result in the formation of only the correct linear ligation product and will prevent intramolecular circularization. The invention thus includes compositions and methods for the covalent connection of two nucleic acid (e.g., RNA) segments. Methods of the invention include the use of an RNA ligase to directionally ligate two single-stranded RNA segments to each other.

One example of an end blocker that may be used in conjunction with, for example, T4 RNA ligase is a dideoxy terminator.

T4 RNA ligase catalyzes the ATP-dependent formation of phosphodiester bonds between 5′-phosphate and 3′-hydroxyl termini. Thus, when one uses T4 RNA ligase, suitable end groups must be present on the termini being ligated. One means for blocking T4 RNA ligase on a terminus is by failing to have the correct terminus format. In other words, termini of RNA segments with a 5′-hydroxyl or a 3′-phosphate will not act as substrates for T4 RNA ligase.
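The terminus rule just described lends itself to a small enumeration. The following Python sketch is a toy model only: it treats a join as possible whenever an upstream segment carries a 3′-hydroxyl and a downstream segment carries a 5′-phosphate, and it ignores reaction kinetics and secondary structure. The `Segment` class, the function names and the terminus labels are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Segment:
    name: str
    five_prime: str   # e.g. "phosphate" or "hydroxyl"
    three_prime: str  # e.g. "hydroxyl", "phosphate" or "dideoxy"


def can_ligate(upstream, downstream):
    """True if the 3' end of `upstream` can be joined to the 5' end of
    `downstream` under the 5'-phosphate / 3'-hydroxyl rule."""
    return upstream.three_prime == "hydroxyl" and downstream.five_prime == "phosphate"


def possible_joins(segments):
    """List every (upstream, downstream, kind) join the rule permits.
    A segment joined to itself represents self-ligation (circularization)."""
    joins = []
    for up, down in product(segments, repeat=2):
        if can_ligate(up, down):
            kind = "self" if up.name == down.name else "linear"
            joins.append((up.name, down.name, kind))
    return joins


# Unblocked segments: both orders and both self-ligations are permitted.
unblocked = [
    Segment("Segment 1", "phosphate", "hydroxyl"),
    Segment("Segment 2", "phosphate", "hydroxyl"),
]
print(possible_joins(unblocked))

# Segment 1 lacks a 5'-phosphate and Segment 2 carries a 3' dideoxy block.
directed = [
    Segment("Segment 1", "hydroxyl", "hydroxyl"),
    Segment("Segment 2", "phosphate", "dideoxy"),
]
print(possible_joins(directed))
```

In the second case, where both outer termini are made unreactive, the model leaves only the Segment 1 to Segment 2 join, which is the directional ligation product.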

Another method that may be used to connect RNA segments is by “click chemistry” (see, e.g., U.S. Pat. Nos. 7,375,234 and 7,070,941, and US Patent Publication No. 2013/0046084, the entire disclosures of which are incorporated herein by reference). For example, one click chemistry reaction is between an alkyne group and an azide group (see FIG. 11). Any click reaction can be used to link RNA segments (e.g., Cu-azide-alkyne, strain-promoted-azide-alkyne, Staudinger ligation, tetrazine ligation, photo-induced tetrazole-alkene, thiol-ene, NHS esters, epoxides, isocyanates, and aldehyde-aminooxy). Ligation of RNA molecules using a click chemistry reaction is advantageous because click chemistry reactions are fast, modular, efficient, often do not produce toxic waste products, can be done with water as a solvent, and can be set up to be stereospecific.

In one embodiment, the present invention uses the “Azide-Alkyne Huisgen Cycloaddition” reaction, which is a 1,3-dipolar cycloaddition between an azide and a terminal or internal alkyne to give a 1,2,3-triazole, for the ligation of RNA segments. One advantage of this ligation method is that the reaction can be initiated by the addition of the required Cu(I) ions.

Other mechanisms by which RNA segments may be connected include the use of halogens (F-, Br-, I-)/alkynes addition reactions, carbonyls/sulfhydryls/maleimide, and carboxyl/amine linkages.

For example, one RNA molecule may be modified with thiol at the 3′ end (using disulfide amidite and universal support or disulfide modified support), and the other RNA molecule may be modified with acrydite at the 5′ end (using acrylic phosphoramidite); the two RNA molecules can then be connected by a Michael addition reaction. This strategy can also be applied to connecting multiple RNA molecules stepwise.

The invention also includes methods for linking more than two (e.g., three, four, five, six, etc.) RNA molecules to each other. One reason this may be done is that, when an RNA molecule longer than about 40 nucleotides is desired, chemical synthesis efficiency degrades, as noted elsewhere herein.

By way of example, a tracrRNA is typically around 80 nucleotides. Such RNA molecules may be produced by processes such as in vitro transcription or chemical synthesis. When chemical synthesis is used to produce such RNA molecules, they may be produced as a single synthesis product or by linking two or more synthesized RNA segments to each other. Further, when three or more RNA segments are connected to each other, different methods may be used to link the individual segments together. Also, the RNA segments may be connected to each other in one “pot”, all at the same time, or in one “pot” at different times or in different “pots” at different times.

For purposes of illustration, assume one wishes to assemble RNA Segments 1, 2 and 3 in numerical order. RNA Segments 1 and 2 may be connected, 5′ to 3′, to each other. The reaction product may then be purified from reaction mixture components (e.g., by chromatography), then placed in a second vessel, “pot”, for connection of the 3′ terminus with the 5′ terminus of RNA Segment 3. The final reaction product may then be connected to the 5′ terminus of RNA Segment 3.

A second, more specific illustration of one embodiment of the invention is as follows. RNA Segment 1 (about 30 nucleotides) is the target locus recognition sequence of a crRNA and a portion of Hairpin Region 1. RNA Segment 2 (about 35 nucleotides) contains the remainder of Hairpin Region 1 and some of the linear tracrRNA between Hairpin Region 1 and Hairpin Region 2. RNA Segment 3 (about 35 nucleotides) contains the remainder of the linear tracrRNA between Hairpin Region 1 and Hairpin Region 2 and all of Hairpin Region 2. In this illustration, RNA Segments 2 and 3 are linked, 5′ to 3′, using click chemistry. Further, the 5′ and 3′ end termini of the reaction product are both phosphorylated. The reaction product is then contacted with RNA Segment 1, having a 3′ terminal hydroxyl group, and T4 RNA ligase to produce a guide RNA molecule.

A number of additional linking chemistries may be used to connect RNA segments according to methods of the invention. Some of these chemistries are set out in Table 6.

TABLE 6
Exemplary RNA Ligation Reactions

Reaction Type | Reaction Summary
Thiol-yne |
NHS esters |
Thiol-ene |
Isocyanates |
Epoxy or aziridine |
Aldehyde-aminooxy |
Cu-catalyzed-azide-alkyne |
Strain-promoted-azide-alkyne | Cyclooctyne cycloaddition (with azide or nitrile oxide or nitrone)
 | Norbornene cycloaddition (with azide or nitrile oxide or nitrone)
 | Oxanorbornadiene cycloaddition
Staudinger ligation |
Tetrazine ligation |
Photo-induced tetrazole-alkene |
[4 + 1] cycloaddition |
Quadricyclane ligation |

One issue with methods for linking RNA segments is that often they do not result in complete conversion of the segments to connected RNA molecules. For example, some chemical linkage reactions only result in 50% of the reactants forming the desired end product. In such instances, it will often be desirable to remove reagents and unreacted RNA segments. This may be done by any number of means such as dialysis, chromatography (e.g., HPLC), precipitation, electrophoresis, etc. Thus, the invention includes compositions and methods for linking RNA segments, where the reaction product RNA molecules are separated from other reaction mixture components.
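When several segments are linked stepwise, incomplete conversion compounds from step to step. The short Python sketch below multiplies per-step conversion fractions to estimate the overall fraction of starting material that ends up in the full length product. It assumes, for illustration only, that each linkage step behaves independently and that no material is lost during purification; the function name is hypothetical.

```python
from math import prod


def overall_yield(step_yields):
    """Overall fraction converted to final product, given the fractional
    yield of each linkage step (assumed independent, an idealization)."""
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each step yield must lie within [0, 1]")
    return prod(step_yields)


# Three segments joined in two steps, each converting 50% of reactants.
print(overall_yield([0.5, 0.5]))  # 0.25
```

Under these assumptions, two successive 50% steps leave a quarter of the starting material as product, which illustrates why removal of unreacted segments becomes more important as the number of linkage steps grows.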

As noted above, CRISPR system components may be “generic” with respect to target loci (e.g., Cas9 protein) or may be specific for a particular target locus (e.g., crRNA). This allows for the production of “generic” components that may be used in conjunction with target sequence specific components. Thus, when a target locus of interest is identified, one need only produce a component or components specific for that target locus. In the instance where one seeks to make two closely associated “nicks” at the target sequence, then, for example, two crRNA molecules will typically need to be produced. These crRNA molecules may be produced when the target sequence of interest is identified or they may be produced in advance and stored until needed.

The invention further includes collections of crRNA molecules with specificity for individual target sites. For example, the invention includes collections of crRNA molecules with specificity for target sites within particular types of cell (e.g., human cells). The members of such collections may be generated based upon sequence information for these particular types of cells. As an example, one such collection could be generated using the complete genome sequence of a particular type of cell. The genome sequence data can be used to generate a library of crRNA molecules with specificity for the coding region of each gene within the human genome. Parameters that could be used to generate such a library may include the location of protospacer adjacent motif (PAM) sites, off target effects (e.g., sequences unique to the target region), and, when gene “knockouts” are desired, locations within coding regions likely to render the gene expression product fully or partially non-functional (e.g., active site coding regions, intron/exon junctions, etc.).
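Two of the library design parameters mentioned above, PAM location and target uniqueness, can be illustrated with a short script. The sketch below is a simplified assumption-laden example, not a complete design tool: it assumes a 20 nucleotide protospacer followed by an NGG PAM (the motif commonly associated with S. pyogenes Cas9), scans only the given strand, and treats a target as “unique” when its exact sequence occurs once in a supplied reference. All function names are hypothetical.

```python
import re


def find_candidate_targets(sequence, protospacer_len=20):
    """Return (position, protospacer, PAM) for every site on the given
    strand where a protospacer is immediately followed by an NGG PAM."""
    sequence = sequence.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(sequence)]


def count_occurrences(query, reference):
    """Count exact, possibly overlapping, occurrences of query."""
    reference = reference.upper()
    last_start = len(reference) - len(query) + 1
    return sum(reference.startswith(query, i) for i in range(last_start))


def unique_targets(sequence, reference, protospacer_len=20):
    """Keep only candidates whose protospacer occurs once in reference."""
    return [
        hit
        for hit in find_candidate_targets(sequence, protospacer_len)
        if count_occurrences(hit[1], reference) == 1
    ]


# Hypothetical region containing one protospacer followed by an AGG PAM.
region = "TTTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "CCCC"
print(find_candidate_targets(region))
```

A fuller implementation would also scan the reverse complement strand and score near matches rather than exact matches only; those steps are omitted here for brevity.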

Collections or libraries of crRNA molecules of the invention may include a wide variety of individual molecules such as from about five to about 100,000 (e.g., from about 50 to about 100,000, from about 200 to about 100,000, from about 500 to about 100,000, from about 800 to about 100,000, from about 1,000 to about 100,000, from about 2,000 to about 100,000, from about 4,000 to about 100,000, from about 5,000 to about 100,000, from about 50 to about 50,000, from about 100 to about 50,000, from about 500 to about 50,000, from about 1,000 to about 50,000, from about 2,000 to about 50,000, from about 4,000 to about 50,000, from about 50 to about 10,000, from about 100 to about 10,000, from about 200 to about 10,000, from about 500 to about 10,000, from about 1,000 to about 10,000, from about 2,000 to about 10,000, from about 4,000 to about 10,000, from about 50 to about 5,000, from about 100 to about 5,000, from about 500 to about 5,000, from about 1,000 to about 5,000, from about 50 to about 2,000, from about 100 to about 2,000, from about 500 to about 2,000, etc.).

RNA molecules generated by and used in the practice of the invention may be stored in a number of ways. RNA molecules are generally not as stable as DNA molecules and, thus, to enhance stability, RNA molecules may be stored at low temperature (e.g., −70° C.) and/or in the presence of one or more RNase inhibitors (e.g., RNASEOUT™, RNASECURE™ Reagent, both available from Thermo Fisher Scientific).

Further, RNA molecules may be chemically modified to be resistant to RNases by, for example, being generated using RNase-resistant ribonucleoside triphosphates. Examples of RNase-resistant modified ribonucleosides include, but are not limited to, 2′-fluoro ribonucleosides, 2′-amino ribonucleosides, and 2′-methoxy ribonucleosides. Additional examples of RNase-resistant modified ribonucleosides are disclosed in U.S. Patent Publ. 2014/0235505 A1, the entire disclosure of which is incorporated herein by reference. 2′-O-allyl-ribonucleotides may also be incorporated into RNA molecules of the invention.

Chemical modification used in the practice of the invention will often be selected based upon a series of criteria, such as effectiveness for the purpose that the chemical modification is used (e.g., RNase resistance), level of toxicity to cells (low generally being better than high), ease of incorporation into the nucleic acid molecules, and minimal interference with the biological activities of the nucleic acid molecule (e.g., the activities of a guide RNA molecule).

Further, RNA molecules of and used in the practice of the invention may be stored in a number of different formats. For example, RNA molecules may be stored in tubes (e.g., 1.5 ml microcentrifuge tubes) or in the wells of plates (e.g., 96 well, 384 well, or 1536 well plates).

The invention thus includes compositions and methods for the production of libraries and/or collections of CRISPR system components, as well as the libraries and/or collections of CRISPR system components themselves.

The invention also includes compositions and methods for the isolation of gRNA molecules. Such methods will often be based upon hybridization of a gRNA region to another nucleic acid molecule, followed by separation of the hybridized complex from other molecules (e.g., nucleic acid molecules) present in a mixture.

As an example, beads containing a nucleic acid molecule with sequence homology to a gRNA molecule may be used to purify the gRNA from a solution. In some instances, the bead will be a magnetic bead. Further, the nucleic acid molecule designed to hybridize to the gRNA molecule may be designed with homology to a sequence present in gRNA molecules or gRNA molecules may be designed to contain a sequence that is used for hybridization. The invention thus includes gRNA molecules that are designed to contain what is effectively a hybridization “tag”.

Such “tags” are particularly useful in high throughput applications. As an example, a 96 well plate may contain different gRNA molecules in each well, wherein each gRNA molecule contains the same tag. A magnetic bead may be placed in one or more wells of the plate and then removed after a specified period of time to allow for gRNA/bead hybridization to take place. These beads may then be individually placed in wells of another plate containing cells and donor DNA under conditions that allow for release of gRNA molecules from the beads (e.g., competition with an oligonucleotide of identical or similar sequence to the tag).

As noted above, hybridization tags may be naturally resident with gRNA molecules or may be introduced into or added to gRNA molecules. Such tags may be added by the alteration of a region present in a gRNA molecule or may be added to the gRNA either internally or at a terminus. Further, tags may be generated during synthesis of gRNA molecules or added after gRNA molecules are produced (e.g., via “click chemistry”).

Hybridization tags will typically be less than 25 (e.g., from about 10 to about 25, from about 15 to about 25, from about 16 to about 25, from about 10 to about 20, from about 15 to about 20, etc.) bases in length. Such tags will typically be able to hybridize to homologous sequences with sufficient affinity for association but will not associate so strongly that they do not efficiently release when desired. Further, shorter tags will often have a higher GC content. In many instances, tags will have a GC content of at least 45% (e.g., from about 45% to about 75%, from about 50% to about 75%, from about 55% to about 75%, from about 60% to about 75%, from about 65% to about 75%, etc.).
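The length and GC content guidelines for hybridization tags can be checked programmatically. The following Python sketch is illustrative only; the function names, the default thresholds (taken from the typical values stated above) and the example tag sequences are assumptions for demonstration and do not represent tags disclosed herein.

```python
def gc_content(sequence):
    """Fraction of bases that are G or C (works for DNA or RNA)."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def tag_meets_guidelines(tag, max_len=25, min_gc=0.45):
    """True if the tag is shorter than max_len bases and has a GC
    content of at least min_gc (defaults follow the text above)."""
    return len(tag) < max_len and gc_content(tag) >= min_gc


# Hypothetical 16-base tags used only to exercise the check.
print(gc_content("GCGCATATGCGCATGC"))            # 0.625
print(tag_meets_guidelines("GCGCATATGCGCATGC"))  # True
print(tag_meets_guidelines("ATATATATATATATAT"))  # False
```

A check of this kind does not model release behavior; whether a tag dissociates efficiently when desired would still need to be determined empirically.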

Also, tagged gRNA molecules may contain a label. This label may be used to quantify the amount of gRNA present. Labels may also be useful when seeking to determine the amount of gRNA transferred by hybridization based means. Such labels may also be used to measure cellular uptake as set out elsewhere herein.

CRISPR Activities

CRISPR complexes of the invention can have any number of activities. For example, CRISPR proteins may be fusion proteins comprising one or more heterologous protein domains (e.g., one, two, three, four, five, etc.). A CRISPR fusion protein may comprise any additional protein sequence, and optionally a linker sequence between any two domains. Examples of protein domains that may be fused to a CRISPR protein include, without limitation, epitope tags, reporter gene sequences, and protein domains having one or more of the following activities: methylase activity, demethylase activity, transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, RNA cleavage activity, and nucleic acid binding activity.

Non-limiting examples of epitope tags include histidine (His) tags, V5 tags, FLAG tags, influenza hemagglutinin (HA) tags, Myc tags, VSV-G tags, and thioredoxin (Trx) tags. Examples of reporter genes include, but are not limited to, glutathione-S-transferase (GST), horseradish peroxidase (HRP), chloramphenicol acetyltransferase (CAT), beta-galactosidase, beta-glucuronidase, luciferase, green fluorescent protein (GFP), HcRed, DsRed, cyan fluorescent protein (CFP), yellow fluorescent protein (YFP), and autofluorescent proteins including blue fluorescent protein (BFP).

A CRISPR protein may be fused to a gene sequence encoding a protein or a fragment of a protein that binds DNA molecules or binds other cellular molecules, including but not limited to maltose binding protein (MBP), S-tag, Lex A DNA binding domain (DBD) fusions, GAL4 DNA binding domain fusions, and herpes simplex virus (HSV) VP16 protein fusions. Additional domains that may form part of a fusion protein comprising a CRISPR protein are described in US 2011/0059502, incorporated herein by reference.

In particular, provided herein, in part, are CRISPR protein endonucleases, which comprise at least one nuclear localization signal, at least one nuclease domain, and at least one domain that interacts with a guide RNA to target the endonuclease to a specific nucleotide sequence for cleavage. Also provided are nucleic acids encoding CRISPR protein endonucleases, as well as methods of using CRISPR protein endonucleases to modify chromosomal sequences of eukaryotic cells or embryos. CRISPR protein endonucleases interact with specific guide RNAs, each of which directs the endonuclease to a specific targeted site, at which site the CRISPR protein endonucleases introduce a double-stranded break that can be repaired by a DNA repair process such that the chromosomal sequence is modified. Since the specificity is provided by the guide RNA (or the crRNA), the CRISPR protein endonucleases are universal and can be used with different guide RNAs to target different genomic sequences. Methods disclosed herein can be used to target and modify specific chromosomal sequences and/or introduce exogenous sequences at targeted locations in the genome of cells or embryos.

CRISPR complexes may also be employed to activate or repress transcription. For example, a dCas9-transcriptional activator fusion protein (e.g., dCas9-VP64) may be used in conjunction with a guide RNA to activate transcription of nucleic acid associated with a target locus. Similarly, dCas9-repressor fusions (e.g., dCas9-KRAB transcriptional repressor) may be used to repress transcription of nucleic acid associated with a target locus. Transcriptional activation and repression such as that referred to above are discussed in, for example, Kearns et al., Cas9 effector-mediated regulation of transcription and differentiation in human pluripotent stem cells, Development, 141:219-223 (2014).

The invention thus includes compositions and methods for the production and use of CRISPR system components for the activation and repression of transcription.

CRISPR Systems

CRISPR systems that may be used in the practice of the invention vary greatly. These systems will generally have the functional activity of being able to form a complex comprising a protein and a first nucleic acid, where the complex recognizes a second nucleic acid. CRISPR systems can be a type I, a type II, or a type III system (see Table 2). Non-limiting examples of suitable CRISPR proteins include Cas3, Cas4, Cas5, Cas5e (or CasD), Cas6, Cas6e, Cas6f, Cas7, Cas8a1, Cas8a2, Cas8b, Cas8c, Cas9, Cas10, Cas10d, CasF, CasG, CasH, Csy1, Csy2, Csy3, Cse1 (or CasA), Cse2 (or CasB), Cse3 (or CasE), Cse4 (or CasC), Csc1, Csc2, Csa5, Csn2, Csm2, Csm3, Csm4, Csm5, Csm6, Cmr1, Cmr3, Cmr4, Cmr5, Cmr6, Csb1, Csb2, Csb3, Csx17, Csx14, Csx10, Csx16, CsaX, Csx3, Csz1, Csx15, Csf1, Csf2, Csf3, Csf4, and Cu1966.

In some embodiments, the CRISPR protein (e.g., Cas9) is derived from a type II CRISPR system. In specific embodiments, the CRISPR system is designed to act as an oligonucleotide (e.g., DNA or RNA)-guided endonuclease derived from a Cas9 protein. The Cas9 protein for this and other functions set out herein can be from Streptococcus pyogenes, Streptococcus thermophilus, Streptococcus sp., Nocardiopsis dassonvillei, Streptomyces pristinaespiralis, Streptomyces viridochromogenes, Streptosporangium roseum, Alicyclobacillus acidocaldarius, Bacillus pseudomycoides, Bacillus selenitireducens, Exiguobacterium sibiricum, Lactobacillus delbrueckii, Lactobacillus salivarius, Microscilla marina, Burkholderiales bacterium, Polaromonas naphthalenivorans, Polaromonas sp., Crocosphaera watsonii, Cyanothece sp., Microcystis aeruginosa, Synechococcus sp., Acetohalobium arabaticum, Ammonifex degensii, Caldicellulosiruptor bescii, Candidatus Desulforudis, Clostridium botulinum, Clostridium difficile, Finegoldia magna, Natranaerobius thermophilus, Pelotomaculum thermopropionicum, Acidithiobacillus caldus, Acidithiobacillus ferrooxidans, Allochromatium vinosum, Marinobacter sp., Nitrosococcus halophilus, Nitrosococcus watsoni, Pseudoalteromonas haloplanktis, Ktedonobacter racemifer, Methanohalobium evestigatum, Anabaena variabilis, Nodularia spumigena, Nostoc sp., Arthrospira maxima, Arthrospira platensis, Arthrospira sp., Lyngbya sp., Microcoleus chthonoplastes, Oscillatoria sp., Petrotoga mobilis, Thermosipho africanus, or Acaryochloris marina.

FIG. 12A-12F shows an alignment of Cas9 amino acid sequences. In many instances, the compositions and methods of the invention will be directed to Type II CRISPR systems. In such instances, a number of different Cas9 proteins may be employed. Cas9 proteins may be defined, to some extent, by their regions of sequence homology. Proteins suitable for use in compositions and methods of the invention will typically include those that have seven or more (e.g., from about seven to about fifteen, from about seven to about eleven, from about seven to about ten, from about seven to about eight, etc.) amino acids identical to the S. pyogenes Cas9 amino acid sequence shown in FIG. 12A-12F. Additional features of CRISPR proteins that fall within the scope of the invention are set out elsewhere herein.

Vector Components and Cells:

A number of functional nucleic acid components (e.g., promoters, polyA signal, origins of replication, selectable markers, etc.) may be used in the practice of the invention. The choice of functional nucleic acid components used in the practice of the invention, when employed, will vary greatly with the nature of the use and the specifics of the system (e.g., intracellular, extracellular, in vitro transcription, coupled in vitro transcription/translation, etc.).

Promoter choice depends upon a number of factors such as the expression products and the type of cell or system that is used. For example, non-mRNA molecules are often produced using RNA polymerase I or III promoters. mRNA is generally transcribed using RNA polymerase II promoters. There are exceptions, however. One is microRNA expression systems where a microRNA can be transcribed from DNA using an RNA polymerase II promoter (e.g., the CMV promoter). While RNA polymerase II promoters do not have “sharp” start and stop points, microRNAs tend to be processed by removal of 5′ and 3′ termini. Thus, “extra” RNA segments at the termini are removed. mRNA (e.g., cas9 mRNA) is normally produced via RNA polymerase II promoters.

The choice of a specific promoter varies with the particular application. For example, the T7, T3 and SP6 promoters are often used for in vitro transcription and in vitro transcription/translation systems. When intracellular expression is desired, the promoter or promoters used will generally be designed to function efficiently within the cells employed. The CMV promoter, for example, is a strong promoter for use within mammalian cells. The hybrid Hsp70A-Rbc S2 promoter is a constitutive promoter that functions well in eukaryotic algae such as Chlamydomonas reinhardtii (see the product manual “GeneArt® Chlamydomonas Protein Expression Kit”, cat. no. A24244, version B.0, from Life Technologies Corp., Carlsbad, Calif.). Additional promoters that may be used in the practice of the invention include AOX1, GAP, cauliflower mosaic virus 35S, pGC1, EF1α, and Hsp70 promoters.

The DNA segment in the expression vector is operatively linked to an appropriate expression control sequence(s) (promoter) to direct RNA synthesis. Suitable eukaryotic promoters include the CMV immediate early promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, the promoters of retroviral LTRs, such as those of the Rous Sarcoma Virus (RSV), and metallothionein promoters, such as the mouse metallothionein-I promoter. Exemplary promoters suitable for use with the invention are from the type III class of RNA polymerase III promoters. Additionally, the promoters may be selected from the group consisting of the U6 and H1 promoters. The U6 and H1 promoters are both members of the type III class of RNA polymerase III promoters.

RNA polymerase III promoters are suitable for in vivo transcription of nucleic acid molecules produced by methods of the invention. For example, linear DNA molecules produced as set out in FIG. 9 may be introduced into cells and transcribed by, for example, naturally resident intracellular transcriptional processes.

Promoters in compositions and methods of the invention may also be inducible, in that expression may be turned “on” or “off.” For example, a tetracycline-regulatable system employing the U6 promoter may be used to control the production of siRNA. Expression vectors may or may not contain a ribosome binding site for translation initiation and a transcription terminator. Vectors may also include appropriate sequences for amplifying expression.

A great variety of cloning/expression systems can be used to express proteins and nucleic acid molecules in the practice of the invention. Such vectors include, among others, chromosomal-, episomal- and viral-derived vectors, for example, vectors derived from plasmids, from bacteriophage, from transposons, from yeast episomes, from insertion elements, from yeast chromosomal elements, from viruses such as baculoviruses, papova viruses, such as SV40, vaccinia viruses, adenoviruses, adeno-associated viruses, avipox (e.g., fowl pox) viruses, suipox viruses, capripox viruses, pseudorabies viruses, picornaviruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids. The expression system constructs can contain control regions that regulate as well as engender expression. Generally, any system or vector suitable to maintain, propagate or express polynucleotides or to express a polypeptide in a host can be used for expression in this regard. The appropriate DNA sequence can be inserted into the expression system by any of a variety of well-known and routine techniques, such as, for example, those set forth in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

Cells suitable for use with the present invention include a wide variety of prokaryotic and eukaryotic cells. In many instances, one or more CRISPR system components will not be naturally associated with the cell (i.e., will be exogenous to the cell).

Representative cells that may be used in the practice of the invention include, but are not limited to, bacterial cells, yeast cells, plant cells and animal cells. Exemplary bacterial cells include Escherichia spp. cells (particularly E. coli cells and most particularly E. coli strains DH10B, Stbl2, DH5α, DB3, DB3.1), Bacillus spp. cells (particularly B. subtilis and B. megaterium cells), Streptomyces spp. cells, Erwinia spp. cells, Klebsiella spp. cells, Serratia spp. cells (particularly S. marcescens cells), Pseudomonas spp. cells (particularly P. aeruginosa cells), and Salmonella spp. cells (particularly S. typhimurium and S. typhi cells). Exemplary animal cells include insect cells (most particularly Drosophila melanogaster cells, Spodoptera frugiperda Sf9 and Sf21 cells and Trichoplusia High-Five cells), nematode cells (particularly C. elegans cells), avian cells, amphibian cells (particularly Xenopus laevis cells), reptilian cells, and mammalian cells (more particularly NIH3T3, CHO, COS, VERO, BHK, CHO-K1, BHK-21, HeLa, COS-7, HEK 293, HEK 293T, HT1080, PC12, MDCK, C2C12, Jurkat, K-562, TF-1, P19 and human embryonic stem cells like clone H9 (Wicell, Madison, Wis., USA)). Exemplary yeast cells include Saccharomyces cerevisiae cells and Pichia pastoris cells. These and other cells are available commercially, for example, from Thermo Fisher Scientific (Waltham, Mass.), the American Type Culture Collection, and Agricultural Research Culture Collection (NRRL; Peoria, Ill.). Exemplary plant cells include cells such as those derived from barley, wheat, rice, soybean, potato, Arabidopsis and tobacco (e.g., Nicotiana tabacum SR1).

Introduction of CRISPR System Components into Cells:

The invention also includes compositions and methods for introduction of CRISPR system components into cells. Introduction of molecules into cells may be done in a number of ways including by methods described in many standard laboratory manuals, such as Davis et al., BASIC METHODS IN MOLECULAR BIOLOGY, (1986) and Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989), such as, calcium phosphate transfection, DEAE-dextran mediated transfection, transfection, microinjection, cationic lipid-mediated transfection, electroporation, transduction, scrape loading, ballistic introduction, nucleoporation, hydrodynamic shock, and infection.

The invention includes methods in which different CRISPR system components are introduced into cells by different means, as well as compositions of matter for performing such methods. For example, a lentiviral vector may be used to introduce Cas9 coding nucleic acid operably linked to a suitable promoter, and guide RNA may be introduced by transfection.

CRISPR system components may be the functional CRISPR system molecules or they may be molecules encoding the functional molecules (e.g., DNA or RNA encoding Cas9, etc.). Methods of the invention relate to the introduction into cells of one or more of the following:

a. Guide RNA,
b. crRNA,
c. tracrRNA,
d. DNA encoding Cas9 or dCas9 (as well as fusion proteins of each), and
e. mRNA encoding Cas9 or dCas9 (as well as fusion proteins of each).

In most instances, CRISPR system components will be introduced into a cell in a manner that results in the generation of CRISPR activities within the cell. Thus, in instances where a cell expresses Cas9 protein (e.g., from chromosomally integrated CRISPR encoding nucleic acid operably linked to a promoter), crRNA and tracrRNA or guide RNA may be introduced into the cell by transfection.

Transfection agents suitable for use with the invention include transfection agents that facilitate the introduction of RNA, DNA and proteins into cells. Exemplary transfection reagents include TurboFect Transfection Reagent (Thermo Fisher Scientific), Pro-Ject Reagent (Thermo Fisher Scientific), TRANSPASS™ P Protein Transfection Reagent (New England Biolabs), CHARIOT™ Protein Delivery Reagent (Active Motif), PROTEOJUICE™ Protein Transfection Reagent (EMD Millipore), 293fectin, LIPOFECTAMINE™ 2000, LIPOFECTAMINE™ 3000 (Thermo Fisher Scientific), LIPOFECTAMINE™ (Thermo Fisher Scientific), LIPOFECTIN™ (Thermo Fisher Scientific), DMRIE-C, CELLFECTIN™ (Thermo Fisher Scientific), OLIGOFECTAMINE™ (Thermo Fisher Scientific), LIPOFECTACE™, FUGENE™ (Roche, Basel, Switzerland), FUGENE™ HD (Roche), TRANSFECTAM™ (Transfectam, Promega, Madison, Wis.), TFX-10™ (Promega), TFX-20™ (Promega), TFX-50™ (Promega), TRANSFECTIN™ (Bio-Rad, Hercules, Calif.), SILENTFECT™ (Bio-Rad), Effectene™ (Qiagen, Valencia, Calif.), DC-chol (Avanti Polar Lipids), GENEPORTER™ (Gene Therapy Systems, San Diego, Calif.), DHARMAFECT 1™ (Dharmacon, Lafayette, Colo.), DHARMAFECT 2™ (Dharmacon), DHARMAFECT 3™ (Dharmacon), DHARMAFECT 4™ (Dharmacon), ESCORT™ III (Sigma, St. Louis, Mo.), and ESCORT™ IV (Sigma Chemical Co.).

The invention further includes methods in which one molecule is introduced into a cell, followed by the introduction of another molecule into the cell. Thus, more than one CRISPR system component molecule may be introduced into a cell at the same time or at different times. As an example, the invention includes methods in which Cas9 is introduced into a cell while the cell is in contact with a transfection reagent designed to facilitate the introduction of proteins into cells (e.g., TurboFect Transfection Reagent), followed by washing of the cells and then introduction of guide RNA while the cell is in contact with LIPOFECTAMINE™ 2000.

Conditions will normally be adjusted on, for example, a per cell type basis for a desired level of CRISPR system component introduction into the cells. While enhanced conditions will vary, enhancement can be measured by detection of intracellular CRISPR system activity. Thus, the invention includes compositions and methods for measurement of the intracellular introduction of CRISPR system components in cells.

The invention also includes compositions and methods related to the formation and introduction of CRISPR complexes into cells. One exemplary method of the invention comprises:

a. forming a complex comprising at least one CRISPR system protein with at least one CRISPR RNA,
b. contacting the complexed CRISPR system protein and RNA with a cell,
c. incubating or culturing the resulting cell for a period of time (e.g., from about 2 minutes to about 8 hours, from about 10 minutes to about 8 hours, from about 20 minutes to about 8 hours, from about 30 minutes to about 8 hours, from about 60 minutes to about 8 hours, from about 20 minutes to about 6 hours, from about 20 minutes to about 3 hours, from about 20 minutes to about 2 hours, from about 45 minutes to about 3 hours, etc.), and
d. measuring CRISPR system activity within the cell.

In some instances, during the practice of methods of the invention, molecules introduced into cells may be labeled. One schematic example of this is set out in FIG. 24 where donor DNA labeled with ALEXA FLUOR® 647 dye (Thermo Fisher Scientific) and GFP-Cas9 RNP complexes are sequentially introduced into HEK293 cells, followed by cell sorting to obtain cells that contain both labels (lower right of FIG. 24). This is advantageous because cells containing specific amounts of both labels may be separated from other cells to obtain a population of cells having enhanced probabilities of undergoing genetic modification (e.g., homologous recombination). A similar workflow is set out in FIG. 25.

Labels may be attached to one or more CRISPR system components and/or other molecules (e.g., a donor nucleic acid molecule) for introduction into the cells. In many instances, labels will be detectable either visually or by cell sorting instruments. Exemplary labels include cyan fluorescent protein (CFP), green fluorescent protein (GFP), orange fluorescent protein (OFP), red fluorescent protein (RFP), and yellow fluorescent protein (YFP). Additional labels include AMCA-6-dUTP, DEAC-dUTP, dUTP-ATTO-425, dUTP-XX-ATTO-488, Fluorescein-12-dUTP, Rhodamine-12-dUTP, dUTP-XX-ATTO-532, dUTP-Cy3, dUTP-ATTO-550, dUTP-Texas Red, dUTP-J647, dUTP-Cy5, dUTP-ATTO-647N, dUTP-ATTO-655, Fluorescein-12-dCTP, Rhodamine-12-dCTP, dCTP-Cy3, dCTP-ATTO-550, dCTP-Texas Red, dCTP-J647, dCTP-Cy5 and dCTP-ATTO-647N available from multiple sources including Jena Bioscience.

Labels may be located in nucleic acid molecules and proteins at one or both termini and/or interior portions of the particular molecules.

When cells are sorted, a number of separation parameters may be employed. In most instances, sorting may be designed to obtain cells having enhanced probability of undergoing genetic modification. For example, cells may be labeled as shown in FIG. 24 with different components required for genetic modification followed by cell sorting to obtain cells having a specific amount of signal of each component. Using the schematic of FIG. 24 for purposes of illustration, cells may be selected based upon the number of donor DNA molecules and the number of GFP-Cas9 RNP complexes present within the cells. This may be done by choosing a minimum signal level for each of the two labels, resulting in those cells being sorted as “positive” cells. Another sorting option is to score as “positive” cells that are in the top 3%, 5%, 8%, 10%, 15%, 20%, 25%, etc. for both signals as compared to all of the cells in a mixture. Assuming that two labels are equivalently and independently taken up, then scoring for the top 25% of cells for both labels would be expected to yield 6.25% of the original population being sorted. These would likely be the cells in the mixture that have the highest probabilities of undergoing genetic modification.
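The 6.25% figure follows from multiplying the two gate fractions under the stated independence assumption. The following is a minimal sketch of that arithmetic; the function name is hypothetical and is used only for illustration.

```python
def expected_double_positive_fraction(gate_a: float, gate_b: float) -> float:
    """Expected fraction of cells falling in the top gate for BOTH labels.

    Assumes, as in the text, that the two labels are taken up
    equivalently and independently, so the joint fraction is the product.
    """
    for gate in (gate_a, gate_b):
        if not 0.0 <= gate <= 1.0:
            raise ValueError("gate fractions must be between 0 and 1")
    return gate_a * gate_b


if __name__ == "__main__":
    # Gate percentages listed in the text, applied to both labels.
    for gate in (0.03, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25):
        joint = expected_double_positive_fraction(gate, gate)
        print(f"top {gate:.0%} for both labels: {joint:.4%} of the population")
```

If uptake of the two labels is positively correlated in practice, the recovered fraction would be expected to exceed this product.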

The invention thus includes methods, as well as compositions for performing such methods, for obtaining cell populations wherein the cells therein have enhanced probabilities of undergoing genetic modification. In some instances, such methods will involve one or both of the following: (1) selection (e.g., via cell sorting) of cells that have taken up one or more components necessary for genetic modification (e.g., one or more CRISPR system components and one or more donor DNA molecules) and (2) introduction of one or more components necessary for genetic modification into cells by processes designed to result in high cellular uptake (e.g., sequential component introduction, as set out herein).

As noted elsewhere herein, in some instances, sequential addition of components may be employed. As shown from the data set out in FIG. 26, sequential delivery of CRISPR system components and donor DNA generally results in higher efficiency levels of genetic modification than co-delivery. While not wishing to be bound by theory, this is possibly due to low uptake efficiency of combinations of CRISPR system components and non-CRISPR complex bound nucleic acid molecules when co-delivered.

When sequential delivery is employed, various components may be introduced into cells in a number of orders. For example, Cas9 protein, gRNA, or Cas9 RNP may be introduced into cells first followed by the introduction of donor DNA. Of course, the reverse order may be used too. Further, Cas9 protein may be expressed within cells and gRNA and donor DNA may be co-delivered or sequentially delivered to the cells in any order. Additionally, gRNA may be expressed within cells and Cas9 protein and donor DNA may be co-delivered or sequentially delivered to the cells in any order. In some instances, gRNA may be introduced into cells first, followed by Cas9 protein, then followed by donor DNA. Of course, other delivery orders may be used too, so long as all of the components required for genetic modification are not delivered simultaneously.

In some instances, methods of the invention include contacting a cell with a linear DNA segment that has sequence homology at both termini to the target locus (e.g., a donor DNA molecule) under conditions that allow for uptake of the linear DNA segment by the cell (e.g., in conjunction with electroporation, contacting with a transfection reagent, etc.), followed by contacting the cell with one or more CRISPR system components (e.g., Cas9 mRNA, guide RNA, Cas9 mRNA and guide RNA, a Cas9 protein/guide RNA complex, etc.) under conditions that allow for uptake of the one or more CRISPR system components by the cell.

In specific aspects, the invention includes methods comprising steps (a), (b) and (c) below. Further, step (c) and step (a) may be swapped in order.

Step (a) involves contacting a cell with a linear DNA segment that has sequence homology at both termini to the target locus (e.g., a donor DNA molecule) under conditions that allow for uptake of the linear DNA segment by the cell. This may be done by any number of means. As examples, cells may be subjected to electroporation in the presence of the linear DNA segment or a transfection reagent may be used.

Step (b) involves waiting a period of time. This time period may be determined in a number of ways.

Step (c) involves contacting a cell with a Cas9 RNP complex under conditions that allow for uptake of the Cas9 RNP complex by the cell.

As shown in FIGS. 27 and 28, the amount and characteristics of nucleic acid molecules introduced into cells in conjunction with CRISPR system components may be adjusted to enhance genetic modification.

Data in FIG. 27 shows that, under the particular conditions used, efficient homologous recombination occurs with 0.2 and 0.5 μg of donor DNA per 10 μl reaction volume. The number of cells was between 50,000 and 200,000. Further, homologous recombination decreases with donor DNA levels lower and higher than those amounts. The invention thus includes compositions and methods where donor DNA concentrations are in the range of 0.01 μg/10 μl to 500 μg/10 μl (e.g., from about 0.01 to about 400, from about 0.01 to about 200, from about 0.01 to about 100, from about 0.01 to about 50, from about 0.01 to about 30, from about 0.01 to about 20, from about 0.01 to about 15, from about 0.01 to about 10, from about 0.1 to about 50, from about 0.1 to about 25, from about 0.1 to about 15, from about 0.1 to about 10, from about 0.1 to about 5, from about 0.1 to about 1, from about 0.1 to about 0.8, etc., μg/10 μl).

Data in FIG. 27 shows that, under the particular conditions used, the efficiency of homologous recombination varies with the length, amount and presence or absence of certain chemical modifications. The length variation may be partially due to the sizes of regions of homology to the locus being modified. For example, single-stranded donor nucleic acid molecules of about 80 bases in length appear to be sufficient for efficient homologous recombination. In many instances, such donor nucleic acid molecules will typically have terminal regions of homology to the target locus being modified with a central region containing nucleic acid for insertion at or substitution of nucleic acid at the target locus. In many instances, terminal homologous regions will be between 30 and 50 bases (e.g., from about 30 to about 45, from about 35 to about 45, from about 40 to about 45, from about 40 to about 48, etc.) in length with an intervening region of between 1 and 20 bases (e.g., one, two, three, four, five, six, seven, etc.). Further, donor nucleic acid molecules may be used that contain regions of homology designed to hybridize to spatially separated regions of a target locus. In many instances, this spatial separation will be less than about 20 nucleotides. Homologous recombination using such donor nucleic acid molecules would be expected to result in deletion of nucleic acid at the target locus.
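The donor layout described above (two terminal homology regions of 30 to 50 bases flanking an intervening region of 1 to 20 bases) can be sketched as a simple assembly check. This is an illustrative sketch only; the function name, default ranges as parameters, and the placeholder sequences are assumptions, not sequences from the disclosure.

```python
VALID_BASES = set("ACGT")


def build_ssdna_donor(left_arm: str, insert: str, right_arm: str,
                      arm_len_range=(30, 50), insert_len_range=(1, 20)) -> str:
    """Assemble a single-stranded donor as left arm + insert + right arm.

    Length limits default to the ranges given in the text; they are
    parameters here because other ranges may suit other applications.
    """
    parts = {"left_arm": left_arm.upper(),
             "insert": insert.upper(),
             "right_arm": right_arm.upper()}
    for name, seq in parts.items():
        if not seq or set(seq) - VALID_BASES:
            raise ValueError(f"{name} must be a non-empty A/C/G/T sequence")
    for name in ("left_arm", "right_arm"):
        if not arm_len_range[0] <= len(parts[name]) <= arm_len_range[1]:
            raise ValueError(f"{name} length {len(parts[name])} outside {arm_len_range}")
    if not insert_len_range[0] <= len(parts["insert"]) <= insert_len_range[1]:
        raise ValueError(f"insert length {len(parts['insert'])} outside {insert_len_range}")
    return parts["left_arm"] + parts["insert"] + parts["right_arm"]


if __name__ == "__main__":
    # Placeholder arms (not real target sequences): 40 + 1 + 39 = 80 bases.
    donor = build_ssdna_donor("ACGT" * 10, "G", ("TGCA" * 10)[:39])
    print(len(donor), donor)
```

A deletion donor, as described at the end of the paragraph, has no intervening region and would need a separate code path that simply joins the two arms.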

The invention further includes compositions and methods for the insertion and correction of single-nucleotide polymorphisms (SNPs). In some instances, such methods involve the use of single-stranded donor nucleic acid with terminal regions having homology to a target site in conjunction with CRISPR system components. Of course, double-stranded donor nucleic acid may also be used for SNP insertion or correction.

Data in FIG. 27 shows that, under the conditions used, efficiency of homologous recombination varies with the presence or absence of terminal chemical modifications to the donor nucleic acid. Phosphorothioate modifications were used at both termini of the modified donor molecules used to generate the data in FIG. 27. While not wishing to be bound by theory, one possibility is that the terminal chemical modifications used protect the donor nucleic acid from nuclease digestion.

The invention thus includes compositions containing and methods employing donor nucleic acid molecules having chemical modifications. In many instances, these chemical modifications will render nucleic acid molecules containing them resistant to one or more nucleases (e.g., exonuclease and/or endonuclease). Chemical modifications that may be used in the practice of the invention include the following: phosphorothioate groups, 5′ blocking groups (e.g., 5′ diguanosine caps), 3′ blocking groups, 2′-fluoro nucleosides, 2′-O-methyl-3′phosphorothioate, 2′-O-methyl-3′thioPACE, inverted dT, inverted ddT, and biotin. Further, a phosphoramidite C3 Spacer can be incorporated internally, or at either end of an oligo to introduce a long hydrophilic spacer arm for the attachment of fluorophores or other groups and can also be used to inhibit degradation by 3′ exonucleases.

In some instances, the terminal base at one or each end of a donor DNA molecule will be chemically modified. In other instances, the terminal two or three bases at one or each end will be chemically modified. In still other instances, internal bases will be chemically modified. In some instances, from about 1% to about 50% (e.g., from about 1% to about 45%, from about 1% to about 40%, from about 1% to about 35%, from about 1% to about 25%, from about 1% to about 15%, from about 5% to about 50%, from about 10% to about 50%, from about 15% to about 50%, from about 15% to about 35%, etc.) of the total number of bases present in donor nucleic acid molecules will be chemically modified.
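To relate the terminal-modification patterns to the percentage ranges, the modified fraction of a donor can be computed directly. The sketch below is illustrative; the function name and the 80-base example length are assumptions chosen to match the donor length discussed earlier.

```python
def percent_bases_modified(oligo_length: int, modified_per_end: int,
                           both_ends: bool = True, internal_modified: int = 0) -> float:
    """Percentage of bases in a donor oligo that carry a chemical modification."""
    if oligo_length <= 0:
        raise ValueError("oligo_length must be positive")
    ends = 2 if both_ends else 1
    modified = modified_per_end * ends + internal_modified
    if modified < 0 or modified > oligo_length:
        raise ValueError("modified base count must be between 0 and oligo_length")
    return 100.0 * modified / oligo_length


if __name__ == "__main__":
    # One, two, or three terminal bases modified at each end of an 80-base donor.
    for n in (1, 2, 3):
        print(f"{n} per end: {percent_bases_modified(80, n):.2f}% of bases modified")
```

For an 80-base donor, terminal-only modification of one to three bases per end therefore falls at the low end of the 1% to 50% range; higher percentages require internal modifications.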

FIG. 29 shows data generated using a series of electroporation conditions. It has been found that Cas9 RNP uptake is robust in some cell types (data not shown) but donor DNA uptake conditions often need to be adjusted with particular cell types in order to achieve efficient uptake. Further, it is believed that once efficient uptake conditions are identified for a particular cell type, those conditions show low levels of variation when different donor DNA molecules are used. Thus, when electroporation is employed, one of the main factors for adjustment of conditions for achieving efficient genetic modification is the selection of efficient conditions for introduction of donor DNA into the particular cell being used. This is especially the case when large donor DNA molecules are used.

A number of compositions and methods may be used to form CRISPR complexes. For example, Cas9 mRNA and a guide RNA may be encapsulated in INVIVOFECTAMINE™ for, for example, later in vivo and in vitro delivery as follows. Cas9 mRNA is mixed (e.g., at a concentration of 0.6 mg/ml) with guide RNA. The resulting mRNA/gRNA solution may be used as is or after addition of a diluent and then mixed with an equal volume of INVIVOFECTAMINE™ and incubated at 50° C. for 30 min. The mixture is then dialyzed using a 50 kDa molecular weight cutoff for 2 hours in 1×PBS, pH 7.4. The resulting dialyzed sample containing the formulated mRNA/gRNA is diluted to the desired concentration and applied directly on cells in vitro or injected via tail vein or intraperitoneally for in vivo delivery. The formulated mRNA/gRNA is stable and can be stored at 4° C.

For Cas9 mRNA transfection with cell culture such as 293 cells, 0.5 μg mRNA was added to 25 μl of Opti-MEM, followed by addition of 50-100 ng gRNA. Meanwhile, two μl of LIPOFECTAMINE™ 3000 or RNAiMax was diluted into 25 μl of Opti-MEM and then mixed with the mRNA/gRNA sample. The mixture was incubated for 15 minutes prior to addition to the cells.

A CRISPR system activity may comprise expression of a reporter (e.g., green fluorescent protein, β-lactamase, luciferase, etc.) or nucleic acid cleavage activity. Using nucleic acid cleavage activity for purposes of illustration, total nucleic acid can be isolated from cells to be tested for CRISPR system activity and then analyzed for the amount of nucleic acid that has been cut at the target locus. If the cell is diploid and both alleles contain target loci, then the data will often reflect two cut sites per cell. CRISPR systems can be designed to cut multiple target sites (e.g., two, three, four, five, etc.) in a haploid target cell genome. Such methods can be used to, in effect, “amplify” the data for enhancement of CRISPR system component introduction into cells (e.g., specific cell types). Conditions may be enhanced such that greater than 50% of the total target loci in cells exposed to CRISPR system components (e.g., one or more of the following: Cas9 protein, Cas9 mRNA, crRNA, tracrRNA, guide RNA, complexed Cas9/guide RNA, etc.) are cleaved. In many instances, conditions may be adjusted so that greater than 60% (e.g., greater than 70%, greater than 80%, greater than 85%, greater than 90%, greater than 95%, from about 50% to about 99%, from about 60% to about 99%, from about 65% to about 99%, from about 70% to about 99%, from about 75% to about 99%, from about 80% to about 99%, from about 85% to about 99%, from about 90% to about 99%, from about 95% to about 99%, etc.) of the total target loci are cleaved.

Any number of conditions may be altered to enhance the introduction of CRISPR system components into cells. Exemplary incubation conditions are pH, ionic strength, cell type, energy charge of the cells, the specific CRISPR system components present, the ratio of CRISPR system components (when more than one CRISPR system component is present), the CRISPR system component/cell ratio, concentration of cells and CRISPR system components, incubation times, etc.

One factor that may be varied, especially when CRISPR complexes are formed, is ionic strength. Ionic strength is a measure of the total ion concentration in solution, weighted by the square of each ion's charge. CRISPR complexes are formed from the association of CRISPR protein with CRISPR RNA and this association is partially dependent upon the ionic strength of the surrounding environment. One method for calculating the ionic strength of a solution is by the Debye and Hückel formula. In many instances, the ionic strength of solutions used in the practice of the invention will be from about 0.001 to about 3 (e.g., from about 0.001 to about 2, from about 0.001 to about 1.5, from about 0.001 to about 1, from about 0.001 to about 0.7, from about 0.001 to about 0.5, from about 0.001 to about 0.25, from about 0.001 to about 0.1, from about 0.01 to about 1, from about 0.01 to about 0.5, from about 0.01 to about 0.2, from about 0.01 to about 0.1, etc.).
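As an illustration, ionic strength is conventionally computed as I = ½ Σ cᵢzᵢ², where cᵢ is the molar concentration of ion i and zᵢ is its charge; this is the quantity that enters Debye-Hückel theory. The sketch below assumes that standard definition and fully dissociated salts. The buffer compositions shown are hypothetical examples, not formulations from the disclosure.

```python
from typing import Iterable, Tuple


def ionic_strength(ions: Iterable[Tuple[float, int]]) -> float:
    """Ionic strength I = 0.5 * sum(c_i * z_i**2).

    `ions` is an iterable of (molar concentration, charge) pairs, one per
    ionic species, assuming complete dissociation.
    """
    total = 0.0
    for concentration, charge in ions:
        if concentration < 0:
            raise ValueError("concentrations must be non-negative")
        total += concentration * charge ** 2
    return 0.5 * total


if __name__ == "__main__":
    # Hypothetical example: 150 mM NaCl.
    nacl = [(0.150, +1), (0.150, -1)]
    print(f"150 mM NaCl: I = {ionic_strength(nacl):.3f} M")

    # Hypothetical example: 150 mM NaCl plus 2 mM MgCl2 (Mg2+ and 2 Cl-).
    with_mg = nacl + [(0.002, +2), (0.004, -1)]
    print(f"with 2 mM MgCl2: I = {ionic_strength(with_mg):.3f} M")
```

Divalent ions contribute four times as much per mole as monovalent ions, which is why small amounts of magnesium shift the ionic strength more than the same amount of sodium.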

pH is another factor that may affect transfection efficiency. Typically, complexation and/or transfection will occur at near physiological pH (e.g., pHs from about 6.5 to about 7.5, pHs from about 6.8 to about 7.5, pHs from about 6.9 to about 7.5, pHs from about 6.5 to about 7.3, pHs from about 6.5 to about 7.1, pHs from about 6.8 to about 7.2, etc.). In some instances, transfection efficiency is known to be sensitive to small variations in pH (e.g., +/−0.2 pH units).

The ratio of CRISPR system components to each other and to other mixture components (e.g., cells) also affects the efficiency of CRISPR system component cellular uptake. Using Cas9 protein and guide RNA for purposes of illustration, Cas9 protein may be complexed with guide RNA before contact with a cell or simultaneously with cellular contact. In many instances, CRISPR protein and CRISPR RNA components will be present in set ratios (e.g., 1:1, 1.5:1, 2:1, 2.5:1, 3:1, 1:1.5, 1:2, 1:2.5, 1:3, from about 0.2:1 to about 4:1, from about 0.2:1 to about 3:1, from about 0.2:1 to about 2:1, from about 0.5:1 to about 6:1, from about 0.5:1 to about 4:1, etc.). One useful ratio for Cas9 protein to guide RNA is 1:1, where each Cas9 protein has available to it one guide RNA molecular partner for complex formation.
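Because the ratios above are molecule-to-molecule, converting them to masses requires molecular weights. The following sketch shows that conversion under loudly stated assumptions: a Cas9 molecular weight of about 160 kDa, a guide RNA of about 100 nucleotides, and an average of about 330 g/mol per RNA nucleotide. None of these values is taken from the disclosure, and the function name is hypothetical.

```python
def grna_mass_ng_for_ratio(cas9_ng: float,
                           cas9_to_grna_ratio: float = 1.0,
                           cas9_mw: float = 160_000.0,   # g/mol, assumed
                           grna_length_nt: int = 100,    # nucleotides, assumed
                           nt_mw: float = 330.0) -> float:  # g/mol per nt, assumed
    """Mass of guide RNA (ng) giving the requested Cas9:gRNA molar ratio."""
    if cas9_ng < 0 or cas9_to_grna_ratio <= 0:
        raise ValueError("mass must be non-negative and ratio positive")
    cas9_nmol = cas9_ng / cas9_mw            # ng / (g/mol) = nmol
    grna_nmol = cas9_nmol / cas9_to_grna_ratio
    grna_mw = grna_length_nt * nt_mw         # approximate gRNA molecular weight
    return grna_nmol * grna_mw               # nmol * g/mol = ng


if __name__ == "__main__":
    for ratio in (1.0, 2.0, 0.5):
        mass = grna_mass_ng_for_ratio(500.0, ratio)
        print(f"Cas9:gRNA {ratio}:1 with 500 ng Cas9 -> about {mass:.0f} ng gRNA")
```

Under these assumptions, a 1:1 molar ratio corresponds to roughly a 5:1 mass ratio of Cas9 protein to guide RNA.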

The uptake of CRISPR complexes by cells is partially determined by the concentration of the CRISPR complexes and the cell density and the ratio of the CRISPR complexes to the cells. Typically, high CRISPR complex concentrations will result in higher amounts of uptake by available cells. Exemplary CRISPR complex/cell density conditions include 10⁷ CRISPR complexes per cell. Additionally, CRISPR complexes per cell may be in the range of 10² to 10¹² complexes per cell (e.g., from about 10² to about 10¹¹, from about 10² to about 10¹⁰, from about 10² to about 10⁹, from about 10² to about 10⁸, from about 10² to about 10⁷, from about 10² to about 10⁶, from about 10³ to about 10¹², from about 10⁴ to about 10¹², from about 10⁵ to about 10¹², from about 10⁶ to about 10¹², from about 10⁷ to about 10¹², from about 10⁸ to about 10¹², from about 10³ to about 10¹⁰, from about 10⁴ to about 10¹⁰, from about 10⁵ to about 10¹¹, etc.). Also, the cell density will typically be about 10⁵ cells per ml. Typically, cell density will be in the range of 10² to 10⁸ cells per ml (e.g., from about 10² to about 10⁷, from about 10² to about 10⁶, from about 10² to about 10⁵, from about 10² to about 10⁴, from about 10³ to about 10⁸, from about 10³ to about 10⁷, from about 10⁴ to about 10⁷, etc.).

The invention includes methods in which one or both of the CRISPR complex/cell density and the total cell density are adjusted such that, when double-stranded target locus cutting is assayed, the percentage of target loci cut is between 80 and 99.9% (e.g., from about 80% to about 99%, from about 85% to about 99%, from about 90% to about 99%, from about 95% to about 99%, from about 96% to about 99%, from about 80% to about 95%, from about 90% to about 97%, etc.).

One exemplary set of conditions that may be used is where ˜5×10⁵ cells are contacted with 500 ng of Cas9 (˜2×10¹² molecules) complexed with target locus specific guide RNA.
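The number of Cas9 molecules in a given mass can be estimated from the molecular weight and Avogadro's number. The sketch below assumes a Cas9 molecular weight of about 160 kDa and a cell count of 5×10⁵; both values are assumptions made for illustration, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_from_mass(mass_ng: float, mw_g_per_mol: float) -> float:
    """Number of molecules in `mass_ng` nanograms of a species of given MW."""
    if mass_ng < 0 or mw_g_per_mol <= 0:
        raise ValueError("mass must be non-negative and MW positive")
    moles = mass_ng * 1e-9 / mw_g_per_mol
    return moles * AVOGADRO


def complexes_per_cell(mass_ng: float, mw_g_per_mol: float, n_cells: float) -> float:
    """Molecules available per cell, assuming every molecule forms a complex."""
    if n_cells <= 0:
        raise ValueError("n_cells must be positive")
    return molecules_from_mass(mass_ng, mw_g_per_mol) / n_cells


if __name__ == "__main__":
    CAS9_MW = 160_000.0   # g/mol, assumed approximate value
    N_CELLS = 5e5         # assumed cell count
    n = molecules_from_mass(500.0, CAS9_MW)
    print(f"500 ng Cas9 is about {n:.2e} molecules")
    print(f"about {complexes_per_cell(500.0, CAS9_MW, N_CELLS):.2e} per cell")
```

With these assumed values, 500 ng of Cas9 comes to roughly 2×10¹² molecules, or a few million potential complexes per cell, which sits within the complexes-per-cell ranges given above.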

The invention also includes compositions and methods for storing reagents for intracellular genetic modification. FIGS. 30, 31, and 32 show the results of three-month stability testing of CRISPR complexes stored at 4° C. and frozen. Data set out in FIG. 30 was generated using gRNA and Cas9 protein alone. Data set out in FIG. 31 was generated using gRNA, Cas9 protein and OPTI-MEM® culture medium. Data set out in FIG. 32 was generated using Cas9 protein and LIPOFECTAMINE® RNAiMax transfection reagent. In all cases, high levels of functional activity were retained for three months with freezing. Similar results were observed with Cas9 protein and OPTI-MEM® culture medium. Cas9 protein and LIPOFECTAMINE® RNAiMax transfection reagent data shows that functional activity appears to drop off relatively quickly at 4° C.

Data shown in FIGS. 30, 31, and 32 each show that CRISPR system component reagents are stable for a minimum of three months. This is particularly useful for high-throughput applications. The invention thus includes high-throughput reagents containing CRISPR system components. In some embodiments, such components comprise one or more of the following: one or more gRNA, one or more Cas9 protein, one or more cell culture medium (e.g., one or more mammalian cell culture medium), one or more transfection reagent, and one or more donor nucleic acid molecule.

For purposes of illustration, the invention includes multi-well plates, as well as high-throughput methods employing such plates, in which different wells contain Cas9 protein and a transfection reagent. Further, different wells contain different gRNA molecules. Such plates may be used in high-throughput methods for altering multiple genetic sites within cells. Each well may further contain, for example, donor DNA with termini homologous to the gRNA directed cleavage site for alteration of different loci within cells.

The invention also includes CRISPR system reagents that remain stable when stored for specified periods of time. For purposes of illustration, the invention provides CRISPR system reagents that retain at least 75% (e.g., from about 75% to about 100%, from about 80% to about 100%, from about 85% to about 100%, from about 90% to about 100%, from about 95% to about 100%, from about 75% to about 90%, from about 80% to about 90%, etc.) of their original CRISPR related activity after 3 months of storage at −20° C. Of course, CRISPR system reagents may be stored at different temperatures (e.g., 4° C., −20° C., −70° C., from about 4° C. to about −70° C., from about −20° C. to about −70° C., etc.). Further, the invention also includes CRISPR system reagents and methods for storing such reagents where at least 75% of their original CRISPR related activity is retained after up to 1 year (e.g., from about 1 month to about 12 months, from about 2 months to about 12 months, from about 3 months to about 12 months, from about 4 months to about 12 months, from about 5 months to about 12 months, from about 1 month to about 9 months, from about 3 months to about 9 months, from about 2 months to about 6 months, etc.).

In some instances, CRISPR complexes may not be stable during storage, especially under certain conditions. For example, mixtures of Cas9, gRNA and transfection reagents may be stable under one set of conditions but not under another set of conditions. It has been determined that under some conditions (e.g., in certain buffer formulations), Cas9, gRNA and transfection reagent mixtures are not stable upon freezing but are stable upon storage at 4° C. The invention thus includes compositions that are stable under one set of storage conditions but not another set of storage conditions.

The data set out in FIGS. 30, 31, and 32 were generated using specific conditions. RNA was prepared in water but EDTA (e.g., 0.1 mM) and/or sodium acetate buffer may be used. Cas9 protein was prepared in 15 mM Tris HCl, 250 mM NaCl, 0.6 mM TCEP, 50% glycerol, pH 8.

Storage data were generated using reagent mixtures contained in wells of multiwell plates. Cas9 was present in wells in an amount of 500 ng/well (0.5 μl of a 0.5 μg/μl stock solution) and gRNA was present in an amount of 200 ng/well (0.7 μl of a 300 ng/μl stock solution). All reagents were stored as 4× solutions. Cas9/gRNA samples were placed under storage conditions as 1.2 μl aliquots in each well. Cas9/gRNA/OPTI-MEM® samples were placed under storage conditions as 20 μl aliquots in each well with 18.8 μl of OPTI-MEM® being present in each well. Cas9/OPTI-MEM® samples were placed under storage conditions as 10 μl aliquots in each well with 9.5 μl of OPTI-MEM® being present in each well. Cas9/RNAiMax samples were placed under storage conditions as 6.5 μl aliquots in each well with 6 μl of RNAiMax transfection reagent being present in each well.

The above reagents were then used after storage in a cleavage assay after being combined with additional reagents and cells. The data set out in FIG. 30 were generated by combining the 1.2 μl of Cas9 protein and gRNA with 18.8 μl OPTI-MEM®, which was incubated at room temperature for 5 minutes. 6 μl of LIPOFECTAMINE® RNAiMax was also mixed with 14 μl of OPTI-MEM®, which was incubated at room temperature for 5 minutes. These two mixtures were then combined and incubated at room temperature for 5 minutes, then contacted with cells. Cas9/gRNA/OPTI-MEM® samples (FIG. 31) were combined with 6 μl LIPOFECTAMINE® RNAiMax and 14 μl OPTI-MEM® that had been incubated at room temperature for 5 minutes. These two mixtures were then combined and incubated at room temperature for 5 minutes, then contacted with cells. Cas9/OPTI-MEM® samples were mixed with both 0.7 μl of gRNA and 9.3 μl of OPTI-MEM® (incubated for 5 minutes at room temperature) and 6 μl LIPOFECTAMINE® RNAiMax and 14 μl OPTI-MEM® (incubated for 5 minutes at room temperature), then contacted with cells after incubation for 5 minutes at room temperature. Cas9/LIPOFECTAMINE® RNAiMax samples (FIG. 32) were mixed with 0.7 μl gRNA and 32.8 μl OPTI-MEM® (incubated at room temperature for 5 minutes), then contacted with cells.
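The volumes recited for the four storage formats can be tallied to confirm that each format is brought to the same final transfection mix volume. The sketch below is only a bookkeeping aid. The dictionary names are illustrative, and the volumes are taken from the storage and assay descriptions above.

```python
# All volumes are in microliters, as recited for each storage format.
STORED_UL = {
    "Cas9/gRNA": 1.2,
    "Cas9/gRNA/OPTI-MEM": 20.0,
    "Cas9/OPTI-MEM": 10.0,
    "Cas9/RNAiMax": 6.5,
}

# Components added after storage, before contacting cells.
ADDED_UL = {
    "Cas9/gRNA": [18.8, 6.0, 14.0],          # OPTI-MEM, RNAiMax, OPTI-MEM
    "Cas9/gRNA/OPTI-MEM": [6.0, 14.0],       # RNAiMax, OPTI-MEM
    "Cas9/OPTI-MEM": [0.7, 9.3, 6.0, 14.0],  # gRNA, OPTI-MEM, RNAiMax, OPTI-MEM
    "Cas9/RNAiMax": [0.7, 32.8],             # gRNA, OPTI-MEM
}


def final_volume_ul(storage_format: str) -> float:
    """Total transfection mix volume for one storage format."""
    total = STORED_UL[storage_format] + sum(ADDED_UL[storage_format])
    return round(total, 1)


final_volumes = {name: final_volume_ul(name) for name in STORED_UL}
for name, volume in final_volumes.items():
    print(f"{name}: {volume} ul")
```

Each format sums to the same total, so the four stored reagent types are compared at matched final volumes.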

For transfection, 293FT cells were seeded one day prior to transfection at 20,000 cells per well in a 96-well plate format to reach around 50% to 60% cell confluency on the day of transfection. Each well at the time of seeding had 100 μl of cell culture medium (DMEM, 10% FBS, and 5% each of sodium pyruvate, non-essential amino acids and GlutaMAX™). At the time of transfection, 10 μl of final transfection mix (containing Cas9, gRNA, LIPOFECTAMINE® RNAiMax and OPTI-MEM®) was added to each well in 96-well format. Following incubation at 37° C. for 72 hours, the cells were harvested for measuring % cleavage efficiency at the respective target loci (in this case the HPRT gene target) using the GENEART™ cleavage detection assay.

In one aspect, the invention relates to compositions and methods related to ready to use reagents. A ready to use reagent may be in any number of forms. For example, a ready to use reagent may contain one or more Cas9 protein, one or more gRNA, one or more transfection reagent, and one or more cell culture medium. A specific example is a reagent that contains a Cas9 protein, two gRNAs, and LIPOFECTAMINE® RNAiMax, all in 2× concentration, and OPTI-MEM® culture medium in a 1× concentration. A ready to use reagent of this type may be mixed 1:1 with cells contained in OPTI-MEM® culture medium to yield a transfection reaction mixture for the introduction of two gRNAs into the cells, where the gRNAs share sequence homology with two locations in the genome of the cells. If appropriate, the cells may simultaneously or subsequently be contacted with one or more nucleic acid molecules for insertion into the genomic cut sites.

Another example of a ready to use reagent includes a combination of one or more Cas9 protein, one or more gRNA, and one or more cell culture medium. A specific example is a reagent that contains a Cas9 protein and two gRNAs in 2× concentration and OPTI-MEM® culture medium in a 1× concentration. A ready to use reagent of this type may be mixed first with LIPOFECTAMINE® RNAiMax and then 1:1 with cells contained in OPTI-MEM® culture medium to yield a transfection reaction mixture for the introduction of two gRNAs into the cells.

Ready to use reagents such as those set out above may be stored at 4° C. for a period of time prior to use. As noted elsewhere herein, under some conditions, Cas9, gRNA and transfection reagent mixtures are not stable upon freezing but are stable upon storage at 4° C.

Ready to use reagents may be labeled with preferred storage conditions and expiration dates that are designed to reflect a specified decrease in activity (e.g., less than 80% of activity). For example, expiration dates may range from about two weeks to about one year (e.g., from about two weeks to about ten months, from about two weeks to about eight months, from about two weeks to about six months, from about two weeks to about four months, from about one month to about one year, from about one month to about ten months, from about one month to about six months, from about one month to about four months, from about three months to about one year, from about three months to about eight months, etc.).

It has also been found that, in some instances, higher concentrations of CRISPR system components result in higher stability upon storage. Thus, in some aspects, the invention includes reagents that contain greater than 50 ng/μl (e.g., from about 50 ng/μl to about 500 ng/μl, from about 100 ng/μl to about 500 ng/μl, from about 150 ng/μl to about 500 ng/μl, from about 200 ng/μl to about 500 ng/μl, from about 250 ng/μl to about 500 ng/μl, from about 300 ng/μl to about 500 ng/μl, from about 400 ng/μl to about 500 ng/μl, etc.) of gRNA. In many instances, the molar ratio of Cas9 protein, when present, to gRNA will be in the range of from about 5:1 to about 1:5 (e.g., from about 5:1 to about 1:4, from about 5:1 to about 1:3, from about 5:1 to about 1:2, from about 5:1 to about 1:1, from about 4:1 to about 1:5, from about 5:1 to about 1:5, from about 2:1 to about 1:5, from about 1:2 to about 1:5, from about 4:1 to about 1:4, from about 3:1 to about 1:3, from about 2:1 to about 1:2, etc.).
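To relate mass amounts to the molar ratio range above, the following sketch converts the 500 ng of Cas9 and 120 ng of gRNA used for transfection in Example 1 into picomoles. The molecular weights are assumptions that do not appear in the text: roughly 160 kDa for Cas9 and a gRNA of about 100 nucleotides at about 330 Da per nucleotide. The result is therefore only an order-of-magnitude illustration.

```python
# Assumed values, not taken from the text.
CAS9_MW_DA = 160_000    # approximate Cas9 molecular weight, g/mol
GRNA_LENGTH_NT = 100    # approximate gRNA length in nucleotides
AVG_NT_MW_DA = 330      # approximate average mass per RNA nucleotide, g/mol


def picomoles(mass_ng: float, mw_da: float) -> float:
    """Convert nanograms to picomoles (ng / (g/mol) gives nmol)."""
    return mass_ng / mw_da * 1000.0


cas9_pmol = picomoles(500, CAS9_MW_DA)
grna_pmol = picomoles(120, GRNA_LENGTH_NT * AVG_NT_MW_DA)

# Molar ratio of Cas9 to gRNA; 1.0 would be equimolar.
cas9_to_grna = cas9_pmol / grna_pmol
within_recited_range = 1 / 5 <= cas9_to_grna <= 5

print(f"Cas9: {cas9_pmol:.2f} pmol, gRNA: {grna_pmol:.2f} pmol")
print(f"Cas9:gRNA molar ratio = {cas9_to_grna:.2f}, in 5:1 to 1:5 range: {within_recited_range}")
```

With these assumed molecular weights the two components are close to equimolar, which sits well inside the recited 5:1 to 1:5 range.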

Kits:

The invention also provides kits for, in part, the assembly and/or storage of nucleic acid molecules and for the editing of cellular genomes. As part of these kits, materials and instructions are provided for both the assembly of nucleic acid molecules and the preparation of reaction mixtures for storage and use of kit components.

Kits of the invention will often contain one or more of the following components:

1. One or more nucleic acid molecule (e.g., one or more primer, one or more DNA molecule encoding Cas9, dCas9, guide RNA, etc., one or more mRNA encoding a CRISPR system component, such as Cas9, dCas9, etc.),

2. One or more polymerase,

3. One or more protein (e.g., one or more CRISPR protein such as Cas9, dCas9, etc.),

4. One or more partial vector (e.g., one or more nucleic acid segment containing an origin of replication and/or a selectable marker) or complete vector, and

5. Instructions for how to use kit components.

In particular, some kits of the invention may include one or more of the following: (a) a double-stranded nucleic acid molecule encoding the 3′ end of a guide RNA molecule (see FIG. 8), wherein this double-stranded nucleic acid molecule does not encode all or part of Hybridization Region 1, (b) a polymerase, and (c) at least one buffer.

In some embodiments, kits may comprise one or more reagents for use in a process utilizing one or more of the CRISPR system components discussed herein or for producing one or more CRISPR system component discussed herein.

Kit reagents may be provided in any suitable container. A kit may provide, for example, one or more reaction or storage buffers. Reagents may be provided in a form that is usable in a particular reaction, or in a form that requires addition of one or more other components before use (e.g., in concentrate or lyophilized form). A buffer can be any buffer, including but not limited to a sodium carbonate buffer, a sodium bicarbonate buffer, a borate buffer, a Tris buffer, a MOPS buffer, a HEPES buffer, and combinations thereof. In some embodiments, the buffer is alkaline. In some embodiments, the buffer has a pH from about 7 to about 10.

EXAMPLES

Example 1: One Step Synthesis of gRNA Template and High Efficiency Cell Engineering Workflow

Abstract

CRISPR-Cas9 systems provide innovative applications in genome engineering. To edit the genome, expression of Cas9, mature crRNA and tracrRNA or a single guide RNA (gRNA) is required. Elements of the mature crRNA and tracrRNA or a gRNA are often built into a Cas9 expression plasmid or constructed in a standard plasmid driven by a U6 promoter for mammalian expression. A novel method for the rapid synthesis of gRNA template is described in this example, which combines gene synthesis and DNA fragment assembly technologies with an accuracy of assembly of >96%. In other words, over 96% of the assembled nucleic acid molecules are the desired assembly products. The method allows rapid synthesis of guide RNA (gRNA) via in vitro transcription using short DNA oligonucleotides. In conjunction with Cas9 protein delivery, Cas9/gRNA complexes can be transfected into the cells through processes such as lipid-mediated methods, electroporation, and cell penetrating peptide mediated delivery. Overall, cell engineering workflows can be reduced to at most four days and, in some instances, two days. Methods described herein are applicable for high throughput gRNA synthesis and genome-wide editing.

Introduction

CRISPR-Cas9 mediated genome engineering enables researchers to modify genomic DNA in vivo directly and efficiently. Three components (Cas9, mature crRNA and tracrRNA) are essential for efficient cell engineering. Although the mature crRNA and tracrRNA can be synthesized chemically, the quality of the synthetic RNA is often not sufficient for in vivo cell engineering due, for example, to the presence of truncated by-products. Thus, mature crRNA and tracrRNA or a combined single gRNA are often transcribed from a Cas9 expression plasmid or built into a separate plasmid driven by a U6 promoter. The resulting plasmids are then transfected or co-transfected into the cells. Because the constructs are relatively large, the delivery of plasmid DNA often becomes the limiting step, especially for suspension cells. Recently, Cas9 mRNA has been employed to increase the rate of DNA cleavage. To make gRNA, a pre-cloned all-in-one plasmid based upon, for example, a vector shown in FIG. 17 or FIG. 18 may serve as template to prepare a gRNA PCR fragment containing a T7 promoter, followed by gel extraction. Alternatively, a synthetic DNA string may be used as a template.

Overall, it is time-consuming to prepare the gRNA template for in vitro transcription by these approaches. As described herein, a gRNA template can be assembled via PCR in about one hour. Further, gRNA can be generated by in vitro transcription in about 3 hours. DNA oligonucleotides can thus be converted into gRNA in about 4 hours. A workflow with the above timing elements was tested. Furthermore, in combination with Cas9 protein transfection technology, a cell engineering cycle was accomplished as described herein in four days.

Materials and Methods

Materials

293FT cells, DMEM medium, Fetal Bovine Serum (FBS), OPTI-MEM® Medium, LIPOFECTAMINE® 3000, RNAIMAX™, MESSENGERMAX™, GENEART® CRISPR Nuclease Vector with OFP Reporter, 2% E-GEL® EX Agarose Gels, PURELINK® PCR Micro Kit, TranscriptAid T7 High Yield Transcription Kit, MEGASHORTSCRIPT™ T7 Transcription Kit, MEGACLEAR™ Transcription Clean-Up Kit, ZERO BLUNT® TOPO® PCR Cloning Kit, PURELINK® Pro Quick96 Plasmid Purification Kit, QUBIT® RNA BR Assay Kit, QUBIT® Protein Assay Kit, Pierce LAL Chromogenic Endotoxin Quantitation Kit, GENEART® Genomic Cleavage Detection Kit, and POROS® Heparin column were from Thermo Fisher Scientific. PHUSION® High-Fidelity DNA Polymerase was purchased from New England Biolabs. HIPREP™ 16/60 Sephacryl S-300 HR gel filtration column was purchased from GE Healthcare. All the DNA oligonucleotides used for gRNA synthesis were from Thermo Fisher Scientific.

Methods

One step synthesis of gRNA template

The design of oligonucleotides for the synthesis of gRNA template is depicted in FIG. 13. The forward primer:

5′-GTT TTA GAG CTA GAA ATA GCA AG-3′ (SEQ ID NO: 13) and reverse primer:

5′-AAA AGC ACC GAC TCG GTG CCA C-3′ (SEQ ID NO: 14) were used to amplify the 80 bp constant region of tracrRNA from a GENEART® CRISPR Nuclease Vector, followed by purification using agarose gel extraction. The concentration of PCR product was measured by Nanodrop (Thermo Fisher Scientific) and the molarity was calculated based on the molecular weight of 24.8 kd. To prepare a pool of oligonucleotides, an aliquot of the 80 bp PCR product was mixed with two end primers 5′-taatacgactcactatagg-3′ (SEQ ID NO: 15) and 5′-AAA AGC ACC GAC TCG GTG CCA C-3′ (SEQ ID NO: 14) with a final concentration of 0.3 μM for the 80 bp PCR product and 10 μM for each of the end primers. For a specific target, a 34 bp forward primer consisting of the 19 bp T7 promoter sequence taatacgactcactatagg (SEQ ID NO: 15) and 15 bp of the 5′ end target sequence, and a 34 bp reverse primer consisting of 20 bp target sequence and 14 bp of the 5′ end tracrRNA sequence gttttagagctaga (SEQ ID NO: 16) were chemically synthesized with 15 bp overlap. A working solution containing the two 34 bp oligonucleotides was prepared at a final concentration of 0.3 μM. Alternatively, a pair of 39 bp forward and reverse primers with 20 bp overlap was synthesized and tested. To set up the one step synthesis of gRNA template, 1.5 μl of pool oligonucleotides and 1.5 μl of the working solution were added to a PCR tube containing 10 μl of 5× Phusion HF buffer, 1 μl of 10 mM dNTP, 35.5 μl H₂O, and 0.5 μl PHUSION® High-Fidelity DNA polymerase. The PCR program was set at 98° C. for 30 sec and then 30 cycles of 98° C. for 5 sec and 55° C. for sec, followed by incubation at 72° C. for 30 sec and a hold at 4° C. The PCR product was analyzed on a 2% E-GEL® EX Agarose Gel, followed by purification using a PURELINK® PCR Micro column. The DNA concentration was determined using a Nanodrop instrument.
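The reaction setup above can be checked by summing the component volumes and applying the dilution relation C_final = C_stock × V_added / V_total. This is a minimal sketch using only the volumes and stock concentrations recited above. The dictionary keys and function name are illustrative.

```python
# Component volumes in microliters, as recited for the one step PCR.
COMPONENTS_UL = {
    "oligonucleotide pool": 1.5,
    "target oligo working solution": 1.5,
    "5x Phusion HF buffer": 10.0,
    "10 mM dNTP": 1.0,
    "water": 35.5,
    "Phusion polymerase": 0.5,
}

total_ul = sum(COMPONENTS_UL.values())


def final_concentration(stock: float, volume_ul: float) -> float:
    """Dilute a stock concentration into the total reaction volume."""
    return stock * volume_ul / total_ul


buffer_x = final_concentration(5.0, COMPONENTS_UL["5x Phusion HF buffer"])
dntp_mm = final_concentration(10.0, COMPONENTS_UL["10 mM dNTP"])
end_primer_um = final_concentration(10.0, COMPONENTS_UL["oligonucleotide pool"])
tracr_fragment_nm = 1000 * final_concentration(0.3, COMPONENTS_UL["oligonucleotide pool"])
target_oligo_nm = 1000 * final_concentration(0.3, COMPONENTS_UL["target oligo working solution"])

print(f"Total volume: {total_ul} ul")
print(f"Buffer: {buffer_x}x, dNTP: {dntp_mm} mM, end primers: {end_primer_um} uM")
print(f"80 bp fragment: {tracr_fragment_nm} nM, target oligos: {target_oligo_nm} nM")
```

On these figures the end primers are in large molar excess over the 80 bp fragment and the target-specific oligonucleotides, which is the usual arrangement when outer primers are meant to amplify a full-length assembly product.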

To determine the error rate, the PCR product was cloned into ZERO BLUNT® TOPO® vector, followed by plasmid DNA isolation and sequencing.

In Vitro Transcription

The in vitro transcription of gRNA template was carried out using the TRANSCRIPTAID™ T7 High Yield Transcription Kit. Briefly, 6 μl of gRNA template (250-500 ng) was added to a reaction mixture containing 8 μl of NTP, 4 μl of 5× reaction buffer and 2 μl of T7 enzyme mix. The reaction was carried out at 37° C. for 2 hrs, followed by incubation with DNase I (2 units per 120 ng DNA template) for 15 minutes. The gRNA product was purified using the MEGACLEAR™ Transcription Clean-Up Kit as described in the manual. The concentration of RNA was determined using the QUBIT® RNA BR Assay Kit.
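The transcription reaction volume and the DNase I amount implied by the stated rate of 2 units per 120 ng of template can be worked out as follows. This is a simple arithmetic sketch. The function name is illustrative and the values come from the paragraph above.

```python
# Reaction components in microliters, as recited above.
IVT_COMPONENTS_UL = {
    "gRNA template": 6.0,
    "NTP": 8.0,
    "5x reaction buffer": 4.0,
    "T7 enzyme mix": 2.0,
}

ivt_total_ul = sum(IVT_COMPONENTS_UL.values())


def dnase_units(template_ng: float) -> float:
    """DNase I units at the stated rate of 2 units per 120 ng template."""
    return 2.0 * template_ng / 120.0


units_low = dnase_units(250)   # lower end of the 250-500 ng template range
units_high = dnase_units(500)  # upper end of the range

print(f"Reaction volume: {ivt_total_ul} ul")
print(f"DNase I: {units_low:.1f} to {units_high:.1f} units")
```

The 4 μl of 5× reaction buffer in this total volume corresponds to a 1× final buffer concentration.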

Expression and Purification of Cas9 Protein

A glycerol stock of a BL21(DE3) star E. coli strain expressing NLS Cas9 protein was inoculated in 20 ml BRM medium and grown overnight at 37° C. in a shaking incubator. The overnight culture was then added to 1 liter of BRM medium and the cells were grown to an OD₆₀₀ of 0.6-0.8 at 37° C. in a shaking incubator (~4-5 hours). An aliquot of un-induced sample was taken for monitoring protein induction with IPTG. 0.5 ml of 1 M IPTG was added to the culture, which was incubated overnight at room temperature in a shaking incubator. An aliquot of induced sample along with un-induced sample was analyzed by SDS-PAGE. Upon validation of protein induction, the culture was centrifuged at 5000 rpm for 15 minutes to harvest the cell pellet (~24 grams of wet weight). 100 ml of buffer A containing 20 mM Tris (pH 7.5), 100 mM NaCl, 10% glycerol, and 1 mM PMSF was used to resuspend the cell pellet. The cell suspension was sonicated on ice for 30 minutes with the power level of the medium tip set at 8, 10 sec “on”, and 20 sec “off”. The cell lysate was clarified by centrifugation at 16500 rpm for 30 minutes. The supernatant was filtered through a 0.2 μm filter device prior to loading onto a 16 ml heparin column previously equilibrated with buffer A at a flow rate of 2 ml/min. The column was first washed with five column volumes of buffer A, and the proportion of buffer B, containing 20 mM Tris (pH 7.5), 1.2 M NaCl and 10% glycerol, was then gradually increased to 40%. The Cas9 protein was eluted with a 5 CV gradient from 40% to 100% buffer B. The fractions were analyzed by SDS-PAGE. Fractions containing Cas9 protein were combined and concentrated using two 15 ml Amicon Centrifugal filter units (EMD Millipore, Cat. No. UFC905024). The concentrated protein was filtered through a 0.2 μm filter device and loaded twice onto a 120 ml HIPREP™ 16/60 Sephacryl S-300 HR column previously equilibrated with buffer C containing 20 mM Tris (pH 8), 250 mM KCl and 10% glycerol. The fractions containing Cas9 protein were pooled and concentrated. The protein concentration was determined by QUBIT® Protein Assay Kit. The endotoxin level in the purified protein was measured by Endotoxin Quantitation Kit. The concentrated protein was adjusted to 50% glycerol and stored at −20° C.
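Two quantities implied by the purification protocol can be made explicit: the NaCl concentration at a given percentage of buffer B, and the approximate final IPTG concentration at induction. The sketch below assumes ideal linear mixing of buffers A and B and a culture volume of about 1 liter (ignoring the 20 ml inoculum). Both are simplifying assumptions, and the function names are illustrative.

```python
BUFFER_A_NACL_MM = 100.0    # NaCl in buffer A, as recited
BUFFER_B_NACL_MM = 1200.0   # NaCl in buffer B (1.2 M), as recited


def nacl_mm(percent_b: float) -> float:
    """NaCl concentration for a linear blend of buffers A and B."""
    fraction_b = percent_b / 100.0
    return BUFFER_A_NACL_MM * (1.0 - fraction_b) + BUFFER_B_NACL_MM * fraction_b


def diluted_mm(stock_mm: float, stock_ml: float, culture_ml: float) -> float:
    """Final concentration after adding a stock to a culture."""
    return stock_mm * stock_ml / (culture_ml + stock_ml)


nacl_at_elution_start = nacl_mm(40)   # start of the elution gradient
nacl_at_elution_end = nacl_mm(100)    # end of the elution gradient

# Assumption: culture volume of about 1000 ml at induction.
iptg_mm = diluted_mm(1000.0, 0.5, 1000.0)

print(f"NaCl over elution gradient: {nacl_at_elution_start:.0f} to {nacl_at_elution_end:.0f} mM")
print(f"Final IPTG: {iptg_mm:.2f} mM")
```

Under these assumptions the elution gradient spans roughly 0.5 M to 1.2 M NaCl, and induction is at about 0.5 mM IPTG.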

Cell Culture

293FT cells were maintained in DMEM medium supplemented with 10% FBS in a 5% CO₂ incubator. One day prior to transfection, the cells were seeded in a 24-well plate at a cell density of 2.5×10⁵ cells/0.5 ml medium. For transfection, 500 ng of purified Cas9 was added to 25 μl of OPTI-MEM® medium, followed by addition of 120 ng gRNA. The sample was mixed by gently tapping the tubes a few times and then incubated at room temperature for 10 minutes. To a separate test tube, 3 μl of RNAIMAX™ was added to 25 μl of OPTI-MEM® medium. The diluted transfection reagent was transferred to the tube containing Cas9 protein/gRNA complexes, followed by incubation at room temperature for 15 minutes. The entire solution was then added to the cells in a well of the 24-well plate and mixed by gently swirling the plate a few times. The plate was incubated at 37° C. for 48 hours in a 5% CO₂ incubator. The percentage of genome editing was measured by GENEART® Genomic Cleavage Detection Kit.

Results and Discussion

One Step Synthesis of gRNA Template

Since gRNA synthesis is one of the limiting steps in genome engineering, an attempt was made to reduce the time for gRNA synthesis. As an example, HPRT-T1 target catttctcagtcctaaaca (SEQ ID NO: 17) was chosen, but these methods were also found to work for GFP and VEGFA-T3 targets (data not shown). Initially, a gene synthesis approach was utilized to assemble a gRNA template using a set of 6 synthetic DNA oligonucleotides (Set 1 oligonucleotides in Table 8). Through optimization of oligonucleotide pool concentration and PCR conditions, a clean PCR product was obtained on an agarose gel (data not shown). An aliquot of the PCR product served as template to synthesize the gRNA via in vitro transcription. The quality of synthetic gRNA was analyzed on a denaturing gel.

To test the functionality of synthetic gRNA, gRNA was associated with Cas9 protein. The resulting complexes were then delivered to the 293FT cells via lipid-based transfection. However, the in vivo genome cleavage assay indicated that the gRNA did not work well (data not shown). To determine the problem, gRNA templates were cloned into a ZERO BLUNT® TOPO® vector and then sequenced. As shown in FIG. 15, it was observed that more than 20% of the gRNA templates harbored mutations, mostly deletions. To minimize the potential sources of errors, instead of using long synthetic oligos to create the complete T7 promoter/guide RNA template, the constant 80 bp tracrRNA region was amplified from a sequence-validated plasmid template, followed by gel purification to remove the template. Then, to fuse the T7 promoter sequence and target sequence to the constant tracrRNA, a pair of 34 bp or 39 bp forward and reverse oligonucleotides that share 15 bp or 20 bp homology, respectively, across the variable target sequence were designed, wherein the middle oligo 2 also shared 19 bases of homology with the tracrRNA region (FIG. 13).
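The oligonucleotide layout described in Materials and Methods (a 34-base forward oligonucleotide made of the 19-base T7 promoter plus the first 15 bases of the target, and a 34-base reverse oligonucleotide covering the 20-base target plus 14 bases of the tracrRNA 5′ end) can be sketched in a few lines. This sketch follows the textual description only and does not attempt to reproduce the exact sequences listed in Table 8. The 20-base target used here is hypothetical, and the function names are illustrative.

```python
T7_PROMOTER = "taatacgactcactatagg"   # SEQ ID NO: 15 (19 bases)
TRACR_5P = "gttttagagctaga"           # SEQ ID NO: 16 (14 bases)

_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lowercase)."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


def design_target_oligos(target: str) -> tuple[str, str]:
    """Build the forward and reverse target-specific oligonucleotides."""
    target = target.lower()
    if len(target) != 20 or set(target) - set("acgt"):
        raise ValueError("target must be 20 bases of a, c, g, t")
    forward = T7_PROMOTER + target[:15]
    reverse = reverse_complement(target + TRACR_5P)
    return forward, reverse


# Hypothetical 20-base target, used only for illustration.
example_target = "gattacagattacagattac"
forward_oligo, reverse_oligo = design_target_oligos(example_target)

# Bases shared between the two oligonucleotides (the annealing region).
overlap = reverse_complement(reverse_oligo)[:15]

print(forward_oligo)
print(reverse_oligo)
print(f"Overlap ({len(overlap)} bases): {overlap}")
```

In this layout the 15-base overlap lies entirely within the variable target sequence, so only these two short oligonucleotides need to be resynthesized for each new target.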

As described in Materials and Methods, the gRNA template was assembled in a single PCR reaction using a pool of DNA oligonucleotides and tracrRNA fragment (FIG. 13). Upon PCR micro column purification, the gRNA template was used to prepare gRNA via in vitro transcription. The quality of gRNA was examined using a TBE-urea denaturing gel. gRNA prepared via PCR amplification from an all-in-one plasmid was used as positive control. An “all-in-one plasmid” is a plasmid that contains all of the components of a CRISPR system, such as guide RNA and Cas9 coding sequences. Vector maps of two similar all-in-one plasmids used in the experiments here are shown in FIG. 17 and FIG. 18.

To examine in vivo functionality of synthesized gRNA, the Cas9 protein was expressed in and purified from E. coli. The Cas9 protein was pre-incubated with synthetic gRNA to form the complexes prior to cell transfection. The gRNA prepared from an all-in-one plasmid served as a positive control. The genome modification was examined by the Genomic Cleavage Detection assay. As depicted in Gel Image B of FIG. 14, the percentage of indels for the newly synthesized gRNAs was similar to the positive control. To determine the error rate in the gRNA template, sequencing analysis was also performed. As depicted in FIG. 15, approximately 7% of gRNA templates harbored mutations, with most deletions occurring at the 3′ end and 5′ end. One mutation was detected within the target region when longer 39 bp oligonucleotides were used. The use of PAGE-purified end primers (~20 bp each) further decreased the error rate to 3.6% with no mutation detected in the target region, which was similar to the control gRNA prepared from an all-in-one plasmid with a 2% error rate. These results indicated that the quality of the gRNA was good enough for most of our applications.

Because the 80 bp tracrRNA contains a poly T at the 3′ end, there was a possibility that the poly T had no effect on genome editing. To test this, serial deletions of poly T at the 3′ end of gRNA (set 3 oligos in Table 8) were made. Based on the in vivo genome cleavage assay, removal of the poly T at the 3′ end of gRNA appeared to have no effect on the performance of gRNA. The addition of three extra Ts at the 3′ end also did not affect the functionality of gRNA (data not shown).

The standard T7 promoter sequence “taatacgactcactataggg” (SEQ ID NO: 18) contains GGG at the 3′ end, which is thought to be essential for maximal production of gRNA via in vitro transcription. However, because the transcription starts from the first G, three extra Gs will be added to the gRNA sequence assuming the target does not have a G at the 5′ end, which might affect the functionality of gRNA. To examine this, the AAVS target ccagtagccagccccgtcc (SEQ ID NO: 19) and the IP3R2 target tcgtgtccctgtacgcgga (SEQ ID NO: 20) were chosen and G deletions at the 3′ end of the T7 promoter were made (see FIG. 16 and Table 7). The addition of Gs, especially 3 Gs in a row, significantly decreased the activity of gRNA. The addition of 1 G exhibited slightly better cleavage efficiency than that of 2 Gs, even though both produced similar amounts of gRNA in the in vitro transcription reaction. However, without any G, the yield of the in vitro transcription reaction was dramatically reduced.

TABLE 7
Effect of G on gRNA synthesis

Construct    PCR yield (ng/μl)    RNA yield (ng/μl)
AAVS-0G      124                  220
AAVS-1G      82                   394
AAVS-2G      70                   294
AAVS-3G      53                   290
IP3R2-0G     100                  65
IP3R2-1G     54                   272
IP3R2-2G     46                   300
IP3R2-3G     122                  289
EMX1-0G      75                   156
EMX1-1G      119                  490
EMX1-2G      101                  680
EMX1-3G      98                   750
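The effect of the number of promoter-proximal Gs on transcription yield can be summarized as a fold change relative to the 0G construct for each target. The sketch below uses the RNA yield values as read from Table 7. The data structure and function name are illustrative, and no statistical claim is intended since replicate data are not given.

```python
# (PCR yield, RNA yield) in ng/ul, keyed by target and number of Gs,
# as read from Table 7.
TABLE_7 = {
    "AAVS": {0: (124, 220), 1: (82, 394), 2: (70, 294), 3: (53, 290)},
    "IP3R2": {0: (100, 65), 1: (54, 272), 2: (46, 300), 3: (122, 289)},
    "EMX1": {0: (75, 156), 1: (119, 490), 2: (101, 680), 3: (98, 750)},
}


def rna_fold_vs_0g(target: str) -> dict[int, float]:
    """RNA yield of the 1G, 2G and 3G constructs relative to 0G."""
    baseline = TABLE_7[target][0][1]
    return {
        n_g: rna / baseline
        for n_g, (_pcr, rna) in TABLE_7[target].items()
        if n_g > 0
    }


fold_changes = {target: rna_fold_vs_0g(target) for target in TABLE_7}
for target, folds in fold_changes.items():
    summary = ", ".join(f"{n}G: {fold:.2f}x" for n, fold in folds.items())
    print(f"{target}: {summary}")
```

For all three targets every construct with at least one G gives a higher RNA yield than the corresponding 0G construct, in line with the observation that yield drops when no G is present.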

In conclusion, compositions and methods provided herein related to gRNA synthesis and associated workflows allow for four day cell engineering. On Day 1, the biologists (1) design and (2) synthesize or order short DNA oligonucleotides and seed the cells of interest. On Day 2, the biologists prepare the gRNA template by one pot PCR, followed by in vitro transcription for making gRNA. Upon association of gRNA with purified Cas9 protein, the Cas9 protein/gRNA complexes are transfected into the cells via a lipid-mediated method or electroporation. On Day 4, the biologists harvest the cells to analyze genome modification. Thus, the invention provides compositions and methods related to improved workflows for genome engineering. In some aspects, these workflows allow for the genome modification experiments to occur in four days from concept to completion.

TABLE 8
Oligonucleotides for gRNA synthesis

Name         Sequence                                                                                      SEQ ID NO

Set 1 oligos
gF1          taatacgactcactataggggcatttctagtcctaaaca                                                       21
gR1          GCT ATT TCT AGC TCT AAA ACT GTT TAG GAC TGA GAA ATG C                                         22
gF2          gttttagagctagaaatagcaagttaaaataaggctagtccgttatcaacttgaaaaagt                                  23
gR2          AAA AGC ACC GAC TCG GTG CCA CTT TTT CAA GTT GAT AAC GGA CTA GCC TTA TTT TAA CTT               24
gEnd-F       taatacgactcactataggg                                                                          18
gEnd-R       AAA AGC ACC GAC TCG GTG CCA C                                                                 14

Set 2 oligos
Con-F        GTT TTA GAG CTA GAA ATA GCA AG                                                                13
gEnd-R       AAA AGC ACC GAC TCG GTG CCA C                                                                 14
gR1-20 bp    GCT ATT TCT AGC TCT AAA ACT GTT TAG GAC TGA GAA ATG                                           25
gR1-15 bp    TTC TAG CTC TAA AAC TGT TTA GGA CTG AGA AAT G                                                 26
gF1-20 bp    taatacgactcactataggcatttctcagtcctaaacag                                                       27
gF1-15 bp    taatacgactcactataggcatttctcagtccta                                                            28
gEnd-F-2G    taatacgactcactatagg                                                                           15

Set 3 oligos
gEnd-R4T     AAA AGC ACC GAC TCG GTG CCA C                                                                 14
gEnd-R3T     AA AGC ACC GAC TCG GTG CCA C                                                                  29
gEnd-R2T     A AGC ACC GAC TCG GTG CCA C                                                                   30
gEnd-R1T     AGC ACC GAC TCG GTG CCA C                                                                     31
gEnd-R0T     GC ACC GAC TCG GTG CCA C                                                                      32

Set 4 oligos
AAVS3G       taatacgactcactatagggCCAGTAGCCAGCCCC                                                           33
AAVS2G       taatacgactcactataggCCAGTAGCCAGCCCC                                                            34
AAVS1G       taatacgactcactatagCCAGTAGCCAGCCCC                                                             35

TABLE 9Nucleotide sequence of the vector shown in FIG. 17. (SEQ ID NO: 36)GTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACCGATCCAGCCTCCGGAGCGGCCGCCACCATGGGCAAGCCCATCCCTAACCCCCTGTTGGGGCTGGACAGCACCGCTCCCAAAAAGAAAAGGAAGGTGGGCATTCACGGCGTGCCTGCGGCCGACAAAAAGTACAGCATCGGCCTTGATATCGGCACCAATAGCGTGGGCTGGGCCGTTATCACAGACGAATACAAGGTACCCAGCAAGAAGTTCAAGGTGCTGGGGAATACAGACAGGCACTCTATCAAGAAAAACCTTATCGGGGCTCTGCTGTTTGACTCAGGCGAGACCGCCGAGGCCACCAGGTTGAAGAGGACCGCAAGGCGAAGGTACACCCGGAGGAAGAACAGGATCTGCTATCTGCAGGAGATCTTCAGCAACGAGATGGCCAAGGTGGACGACAGCTTCTTCCACAGGCTGGAGGAGAGCTTCCTTGTCGAGGAGGATAAGAAGCACGAACGACACCCCATCTTCGGCAACATAGTCGACGAGGTCGCTTATCACGAGAAGTACCCCACCATCTACCACCTGCGAAAGAAATTGGTGGATAGCACCGATAAAGCCGACTTGCGACTTATCTACTTGGCTCTGGCGCACATGATTAAGTTCAGGGGCCACTTCCTGATCGAGGGCGACCTTAACCCCGACAACAGTGACGTAGACAAATTGTTCATCCAGCTTGTACAGACCTATAACCAGCTGTTCGAGGAAAACCCTATTAACGCCAGCGGGGTGGATGCGAAGGCCATACTTAGCGCCAGGCTGAGCAAAAGCAGGCGCTTGGAGAACCTGATAGCCCAGCTGCCCGGTGAAAAGAAGAACGGCCTCTTCGGTAATCTGATTGCCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACCTGGCAGAAGATGCCAAGCTGCAGTTGAGTAAGGACACCTATGACGACGACTTGGACAATCTGCTCGCCCAAATCGGCGACCAGTACGCTGACCTGTTCCTCGCCGCCAAGAACCTTTCTGACGCAATCCTGCTTAGCGATATCCTTAGGGTGAACACAGAGATCACCAAGGCCCCCCTGAGCGCCAGCATGATCAAGAGGTACGACGAGCACCATCAGGACCTGACCCTTCTGAAGGCCCTGGTGAGGCAGCAACTGCCCGAGAAGTACAAGGAGATCTTTTTCGACCAGAGCAAGAACGGCTACGCCGGCTACATCGACGGCGGAGCCAG
CCAAGAGGAGTTCTACAAGTTCATCAAGCCCATCCTGGAGAAGATGGATGGCACCGAGGAGCTGCTGGTGAAGCTGAACAGGGAAGATTTGCTCCGGAAGCAGAGGACCTTTGACAACGGTAGCATCCCCCACCAGATCCACCTGGGCGAGCTGCACGCAATACTGAGGCGACAGGAGGATTTCTACCCCTTCCTCAAGGACAATAGGGAGAAAATCGAAAAGATTCTGACCTTCAGGATCCCCTACTACGTGGGCCCTCTTGCCAGGGGCAACAGCCGATTCGCTTGGATGACAAGAAAGAGCGAGGAGACCATCACCCCCTGGAACTTCGAGGAAGTGGTGGACAAAGGAGCAAGCGCGCAGTCTTTCATCGAACGGATGACCAATTTCGACAAAAACCTGCCTAACGAGAAGGTGCTGCCCAAGCACAGCCTGCTTTACGAGTACTTCACCGTGTACAACGAGCTCACCAAGGTGAAATATGTGACCGAGGGCATGCGAAAACCCGCTTTCCTGAGCGGCGAGCAGAAGAAGGCCATCGTGGACCTGCTGTTCAAGACCAACAGGAAGGTGACCGTGAAGCAGCTGAAGGAGGACTACTTCAAGAAGATCGAGTGCTTTGATAGCGTGGAAATAAGCGGCGTGGAGGACAGGTTCAACGCCAGCCTGGGCACCTACCACGACTTGTTGAAGATAATCAAAGACAAGGATTTCCTGGATAATGAGGAGAACGAGGATATACTCGAGGACATCGTGCTGACTTTGACCCTGTTTGAGGACCGAGAGATGATTGAAGAAAGGCTCAAAACCTACGCCCACCTGTTCGACGACAAAGTGATGAAACAACTGAAGAGACGAAGATACACCGGCTGGGGCAGACTGTCCAGGAAGCTCATCAACGGCATTAGGGACAAGCAGAGCGGCAAGACCATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACCGAAACTTCATGCAGCTGATTCACGATGACAGCTTGACCTTCAAGGAGGACATCCAGAAGGCCCAGGTTAGCGGCCAGGGCGACTCCCTGCACGAACATATTGCAAACCTGGCAGGCTCCCCTGCGATCAAGAAGGGCATACTGCAGACCGTTAAGGTTGTGGACGAATTGGTCAAGGTCATGGGCAGGCACAAGCCCGAAAACATAGTTATAGAGATGGCCAGAGAGAACCAGACCACCCAAAAGGGCCAGAAGAACAGCCGGGAGCGCATGAAAAGGATCGAGGAGGGTATCAAGGAACTCGGAAGCCAGATCCTCAAAGAGCACCCCGTGGAGAATACCCAGCTCCAGAACGAGAAGCTGTACCTGTACTACCTGCAGAACGGCAGGGACATGTACGTTGACCAGGAGTTGGACATCAACAGGCTTTCAGACTATGACGTGGATCACATAGTGCCCCAGAGCTTTCTTAAAGACGATAGCATCGACAACAAGGTCCTGACCCGCTCCGACAAAAACAGGGGCAAAAGCGACAACGTGCCAAGCGAAGAGGTGGTTAAAAAGATGAAGAACTACTGGAGGCAACTGCTCAACGCGAAATTGATCACCCAGAGAAAGTTCGATAACCTGACCAAGGCCGAGAGGGGCGGACTCTCCGAACTTGACAAAGCGGGCTTCATAAAGAGGCAGCTGGTCGAGACCCGACAGATCACGAAGCACGTGGCCCAAATCCTCGACAGCAGAATGAATACCAAGTACGATGAGAATGACAAACTCATCAGGGAAGTGAAAGTGATTACCCTGAAGAGCAAGTTGGTGTCCGACTTTCGCAAAGATTTCCAGTTCTACAAGGTGAGGGAGATCAACAACTACCACCATGCCCACGACGCATACCTGAACGCCGTGGTCGGCACCGCCCTGATTAAGAAGTATCCAAAGCTGGAGTCCGAATTTGTCTACGGCGACTACAAAGTTTACGATGTGAGGAAGATGATCGCTAAGAGCGAACAGGAGATCGGCAAGGCCACCGCTA
AGTATTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAGATCACACTTGCCAACGGCGAAATCAGGAAGAGGCCGCTTATCGAGACCAACGGTGAGACCGGCGAGATCGTGTGGGACAAGGGCAGGGACTTCGCCACCGTGAGGAAAGTCCTGAGCATGCCCCAGGTGAATATTGTGAAAAAAACTGAGGTGCAGACAGGCGGCTTTAGCAAGGAATCCATCCTGCCCAAGAGGAACAGCGACAAGCTGATCGCCCGGAAGAAGGACTGGGACCCTAAGAAGTATGGAGGCTTCGACAGCCCCACCGTAGCCTACAGCGTGCTGGTGGTCGCGAAGGTAGAGAAGGGGAAGAGCAAGAAACTGAAGAGCGTGAAGGAGCTGCTCGGCATAACCATCATGGAGAGGTCCAGCTTTGAGAAGAACCCCATTGACTTTTTGGAAGCCAAGGGCTACAAAGAGGTCAAAAAGGACCTGATCATCAAACTCCCCAAGTACTCCCTGTTTGAATTGGAGAACGGCAGAAAGAGGATGCTGGCGAGCGCTGGGGAACTGCAAAAGGGCAACGAACTGGCGCTGCCCAGCAAGTACGTGAATTTTCTGTACCTGGCGTCCCACTACGAAAAGCTGAAAGGCAGCCCCGAGGACAACGAGCAGAAGCAGCTGTTCGTGGAGCAGCACAAGCATTACCTGGACGAGATAATCGAGCAAATCAGCGAGTTCAGCAAGAGGGTGATTCTGGCCGACGCGAACCTGGATAAGGTCCTCAGCGCCTACAACAAGCACCGAGACAAACCCATCAGGGAGCAGGCCGAGAATATCATACACCTGTTCACCCTGACAAATCTGGGCGCACCTGCGGCATTCAAATACTTCGATACCACCATCGACAGGAAAAGGTACACTAGCACTAAGGAGGTGCTGGATGCCACCTTGATCCACCAGTCCATTACCGGCCTGTATGAGACCAGGATCGACCTGAGCCAGCTTGGAGGCGACTCTAGGGCGGACCCAAAAAAGAAAAGGAAGGTGGAATTCTCTAGAGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAATGAACCGGGGAGTCCCTTTTAGGCACTTGCTTCTGGTGCTGCAACTGGCGCTCCTCCCAGCAGCCACTCAGGGAAAGAAAGTGGTGCTGGGCAAAAAAGGGGATACAGTGGAACTGACCTGTACAGCTTCCCAGAAGAAGAGCATACAATTCCACTGGAAAAACTCCAACCAGATAAAGATTCTGGGAAATCAGGGCTCCTTCTTAACTAAAGGTCCATCCAAGCTGAATGATCGCGCTGACTCAAGAAGAAGCCTTTGGGACCAAGGAAACTTCCCCCTGATCATCAAGAATCTTAAGATAGAAGACTCAGATACTTACATCTGTGAAGTGGAGGACCAGAAGGAGGAGGTGCAATTGCTAGTGTTCGGATTGACTGCCAACTCTGACACCCACCTGCTTCAGGGGCAGAGCCTGACCCTGACCTTGGAGAGCCCCCCTGGTAGTAGCCCCTCAGTGCAATGTAGGAGTCCAAGGGGTAAAAACATACAGGGGGGGAAGACCCTCTCCGTGTCTCAGCTGGAGCTCCAGGATAGTGGCACCTGGACATGCACTGTCTTGCAGAACCAGAAGAAGGTGGAGTTCAAAATAGACATCGTGGTGCTAGCTTTCCAGAAGGCCTCCAGCATAGTCTATAAGAAAGAGGGGGAACAGGTGGAGTTCTCCTTCCCACTCGCCTTTACAGTTGAAAAGCTGACGGGCAGTGGCGAGCTGTGGTGGCAGGCGGAGAGGGCTTCCTCCTCCAAGTCTTGGATCACCTTTGACCTGAAGAACAAGGAAGTGTCTGTAAAACGGGTTACCCAGGACCCTAAGCTCCAGATGGGCAAGAAGCTCCCGCTCCACCTCACCCTGCCCCAGGCCTTGCCTCAGTATGCTGGCTCTGGAAACCTC
ACCCTGGCCCTTGAAGCGAAAACAGGAAAGTTGCATCAGGAAGTGAACCTGGTGGTGATGAGAGCCACTCAGCTCCAGAAAAATTTGACCTGTGAGGTGTGGGGACCCACCTCCCCTAAGCTGATGCTGAGCTTGAAACTGGAGAACAAGGAGGCAAAGGTCTCGAAGCGGGAGAAGGCGGTGTGGGTGCTGAACCCTGAGGCGGGGATGTGGCAGTGTCTGCTGAGTGACTCGGGACAGGTCCTGCTGGAATCCAACATCAAGGTTCTGCCCACATGGTCGACCCCGGTGCAGCCAATGGCCCTGATTGTGCTGGGGGGCGTCGCCGGCCTCCTGCTTTTCATTGGGCTAGGCATCTTCTTCTGTGTCAGGTGCCGGCACACCGGTTAGTAATGAGTTTAAACGGGGGAGGCTAACTGAAACACGGAAGGAGACAATACCGGAAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAACGCACGGGTGTTGGGTCGTTTGTTCATAAACGCGGGGTTCGGTCCCAGGGCTGGCACTCTGTCGATACCCCACCGAGACCCCATTGGGGCCAATACGCCCGCGTTTCTTCCTTTTCCCCACCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGGGGCGGCAGGCCCTGCCATAGCAGATCTGCGCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTAATTAAGGTCGGGCAGGAAGAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTAGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGNNNNNNNNNNNNNNNNNNNNGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTCTAGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGAC
GCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCTCCCGATCCCCTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGG

TABLE 10Nucleotide sequence of the vector shown in FIG. 18. (SEQ ID NO: 37)GTTAGGCGTTTTGCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTCGTTTAGTGAACCGTCAGATCGCCTGGAGACGCCATCCACGCTGTTTTGACCTCCATAGAAGACACCGGGACCGATCCAGCCTCCGGAGCGGCCGCCACCATGGGCAAGCCCATCCCTAACCCCCTGTTGGGGCTGGACAGCACCGCTCCCAAAAAGAAAAGGAAGGTGGGCATTCACGGCGTGCCTGCGGCCGACAAAAAGTACAGCATCGGCCTTGATATCGGCACCAATAGCGTGGGCTGGGCCGTTATCACAGACGAATACAAGGTACCCAGCAAGAAGTTCAAGGTGCTGGGGAATACAGACAGGCACTCTATCAAGAAAAACCTTATCGGGGCTCTGCTGTTTGACTCAGGCGAGACCGCCGAGGCCACCAGGTTGAAGAGGACCGCAAGGCGAAGGTACACCCGGAGGAAGAACAGGATCTGCTATCTGCAGGAGATCTTCAGCAACGAGATGGCCAAGGTGGACGACAGCTTCTTCCACAGGCTGGAGGAGAGCTTCCTTGTCGAGGAGGATAAGAAGCACGAACGACACCCCATCTTCGGCAACATAGTCGACGAGGTCGCTTATCACGAGAAGTACCCCACCATCTACCACCTGCGAAAGAAATTGGTGGATAGCACCGATAAAGCCGACTTGCGACTTATCTACTTGGCTCTGGCGCACATGATTAAGTTCAGGGGCCACTTCCTGATCGAGGGCGACCTTAACCCCGACAACAGTGACGTAGACAAATTGTTCATCCAGCTTGTACAGACCTATAACCAGCTGTTCGAGGAAAACCCTATTAACGCCAGCGGGGTGGATGCGAAGGCCATACTTAGCGCCAGGCTGAGCAAAAGCAGGCGCTTGGAGAACCTGATAGCCCAGCTGCCCGGTGAAAAGAAGAACGGCCTCTTCGGTAATCTGATTGCCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACCTGGCAGAAGATGCCAAGCTGCAGTTGAGTAAGGACACCTATGACGACGACTTGGACAATCTGCTCGCCCAAATCGGCGACCAGTACGCTGACCTGTTCCTCGCCGCCAAGAACCTTTCTGACGCAATCCTGCTTAGCGATATCCTTAGGGTGAACACAGAGATCACCAAGGCCCCCCTGAGCGCCAGCATGATCAAGAGGTACGACGAGCACCATCAGGACCTGACCCTTCTGAAGGCCCTGGTGAGGCAGCAACTGCCCGAGAAGTACAAGGAGATCTTTTTCGACCAGAGCAAGAACGGCTACGCCGGCTACATCGACGGCGGAGCCA
GCCAAGAGGAGTTCTACAAGTTCATCAAGCCCATCCTGGAGAAGATGGATGGCACCGAGGAGCTGCTGGTGAAGCTGAACAGGGAAGATTTGCTCCGGAAGCAGAGGACCTTTGACAACGGTAGCATCCCCCACCAGATCCACCTGGGCGAGCTGCACGCAATACTGAGGCGACAGGAGGATTTCTACCCCTTCCTCAAGGACAATAGGGAGAAAATCGAAAAGATTCTGACCTTCAGGATCCCCTACTACGTGGGCCCTCTTGCCAGGGGCAACAGCCGATTCGCTTGGATGACAAGAAAGAGCGAGGAGACCATCACCCCCTGGAACTTCGAGGAAGTGGTGGACAAAGGAGCAAGCGCGCAGTCTTTCATCGAACGGATGACCAATTTCGACAAAAACCTGCCTAACGAGAAGGTGCTGCCCAAGCACAGCCTGCTTTACGAGTACTTCACCGTGTACAACGAGCTCACCAAGGTGAAATATGTGACCGAGGGCATGCGAAAACCCGCTTTCCTGAGCGGCGAGCAGAAGAAGGCCATCGTGGACCTGCTGTTCAAGACCAACAGGAAGGTGACCGTGAAGCAGCTGAAGGAGGACTACTTCAAGAAGATCGAGTGCTTTGATAGCGTGGAAATAAGCGGCGTGGAGGACAGGTTCAACGCCAGCCTGGGCACCTACCACGACTTGTTGAAGATAATCAAAGACAAGGATTTCCTGGATAATGAGGAGAACGAGGATATACTCGAGGACATCGTGCTGACTTTGACCCTGTTTGAGGACCGAGAGATGATTGAAGAAAGGCTCAAAACCTACGCCCACCTGTTCGACGACAAAGTGATGAAACAACTGAAGAGACGAAGATACACCGGCTGGGGCAGACTGTCCAGGAAGCTCATCAACGGCATTAGGGACAAGCAGAGCGGCAAGACCATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACCGAAACTTCATGCAGCTGATTCACGATGACAGCTTGACCTTCAAGGAGGACATCCAGAAGGCCCAGGTTAGCGGCCAGGGCGACTCCCTGCACGAACATATTGCAAACCTGGCAGGCTCCCCTGCGATCAAGAAGGGCATACTGCAGACCGTTAAGGTTGTGGACGAATTGGTCAAGGTCATGGGCAGGCACAAGCCCGAAAACATAGTTATAGAGATGGCCAGAGAGAACCAGACCACCCAAAAGGGCCAGAAGAACAGCCGGGAGCGCATGAAAAGGATCGAGGAGGGTATCAAGGAACTCGGAAGCCAGATCCTCAAAGAGCACCCCGTGGAGAATACCCAGCTCCAGAACGAGAAGCTGTACCTGTACTACCTGCAGAACGGCAGGGACATGTACGTTGACCAGGAGTTGGACATCAACAGGCTTTCAGACTATGACGTGGATCACATAGTGCCCCAGAGCTTTCTTAAAGACGATAGCATCGACAACAAGGTCCTGACCCGCTCCGACAAAAACAGGGGCAAAAGCGACAACGTGCCAAGCGAAGAGGTGGTTAAAAAGATGAAGAACTACTGGAGGCAACTGCTCAACGCGAAATTGATCACCCAGAGAAAGTTCGATAACCTGACCAAGGCCGAGAGGGGCGGACTCTCCGAACTTGACAAAGCGGGCTTCATAAAGAGGCAGCTGGTCGAGACCCGACAGATCACGAAGCACGTGGCCCAAATCCTCGACAGCAGAATGAATACCAAGTACGATGAGAATGACAAACTCATCAGGGAAGTGAAAGTGATTACCCTGAAGAGCAAGTTGGTGTCCGACTTTCGCAAAGATTTCCAGTTCTACAAGGTGAGGGAGATCAACAACTACCACCATGCCCACGACGCATACCTGAACGCCGTGGTCGGCACCGCCCTGATTAAGAAGTATCCAAAGCTGGAGTCCGAATTTGTCTACGGCGACTACAAAGTTTACGATGTGAGGAAGATGATCGCTAAGAGCGAACAGGAGATCGGCAAGGCCACCGCT
AAGTATTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAGATCACACTTGCCAACGGCGAAATCAGGAAGAGGCCGCTTATCGAGACCAACGGTGAGACCGGCGAGATCGTGTGGGACAAGGGCAGGGACTTCGCCACCGTGAGGAAAGTCCTGAGCATGCCCCAGGTGAATATTGTGAAAAAAACTGAGGTGCAGACAGGCGGCTTTAGCAAGGAATCCATCCTGCCCAAGAGGAACAGCGACAAGCTGATCGCCCGGAAGAAGGACTGGGACCCTAAGAAGTATGGAGGCTTCGACAGCCCCACCGTAGCCTACAGCGTGCTGGTGGTCGCGAAGGTAGAGAAGGGGAAGAGCAAGAAACTGAAGAGCGTGAAGGAGCTGCTCGGCATAACCATCATGGAGAGGTCCAGCTTTGAGAAGAACCCCATTGACTTTTTGGAAGCCAAGGGCTACAAAGAGGTCAAAAAGGACCTGATCATCAAACTCCCCAAGTACTCCCTGTTTGAATTGGAGAACGGCAGAAAGAGGATGCTGGCGAGCGCTGGGGAACTGCAAAAGGGCAACGAACTGGCGCTGCCCAGCAAGTACGTGAATTTTCTGTACCTGGCGTCCCACTACGAAAAGCTGAAAGGCAGCCCCGAGGACAACGAGCAGAAGCAGCTGTTCGTGGAGCAGCACAAGCATTACCTGGACGAGATAATCGAGCAAATCAGCGAGTTCAGCAAGAGGGTGATTCTGGCCGACGCGAACCTGGATAAGGTCCTCAGCGCCTACAACAAGCACCGAGACAAACCCATCAGGGAGCAGGCCGAGAATATCATACACCTGTTCACCCTGACAAATCTGGGCGCACCTGCGGCATTCAAATACTTCGATACCACCATCGACAGGAAAAGGTACACTAGCACTAAGGAGGTGCTGGATGCCACCTTGATCCACCAGTCCATTACCGGCCTGTATGAGACCAGGATCGACCTGAGCCAGCTTGGAGGCGACTCTAGGGCGGACCCAAAAAAGAAAAGGAAGGTGGAATTCTCTAGAGGCAGTGGAGAGGGCAGAGGAAGTCTGCTAACATGCGGTGACGTCGAGGAGAATCCTGGCCCAATGAACCTGAGCAAAAACGTGAGCGTGAGCGTGTATATGAAGGGGAACGTCAACAATCATGAGTTTGAGTACGACGGGGAAGGTGGTGGTGATCCTTATACAGGTAAATATTCCATGAAGATGACGCTACGTGGTCAAAATTCCCTACCCTTTTCCTATGATATCATTACCACGGCATTTCAGTATGGTTTCCGCGTATTTACAAAATACCCTGAGGGAATTGTTGACTATTTTAAGGACTCGCTTCCCGACGCATTCCAGTGGAACAGACGAATTGTGTTTGAAGATGGTGGAGTACTAAACATGAGCAGTGATATCACATATAAAGATAATGTTCTGCATGGTGACGTCAAGGCTGAGGGAGTGAACTTCCCGCCGAATGGGCCAGTGATGAAGAATGAAATTGTGATGGAGGAACCGACTGAAGAAACATTTACTCCAAAAAACGGGGTTCTTGTTGGCTTTTGTCCCAAAGCGTACTTACTTAAAGACGGTTCCTATTACTATGGAAATATGACAACATTTTACAGATCCAAGAAATCTGGCCAGGCACCTCCTGGGTATCACTTTGTTAAGCATCGTCTCGTCAAGACCAATGTGGGACATGGATTTAAGACGGTTGAGCAGACTGAATATGCCACTGCTCATGTCAGTGATCTTCCCAAGTTCGAAGCTTGATAATGAGTTTAAACGGGGGAGGCTAACTGAAACACGGAAGGAGACAATACCGGAAGGAACCCGCGCTATGACGGCAATAAAAAGACAGAATAAAACGCACGGGTGTTGGGTCGTTTGTTCATAAACGCGGGGTTCGGTCCCAGGGCTGGCACTCTGTCGATACCCCACCGAGACCCCATTGGGGCCAATACGCCCGCGTTTCTTCCTTTTCC
CCACCCCACCCCCCAAGTTCGGGTGAAGGCCCAGGGCTCGCAGCCAACGTCGGGGCGGCAGGCCCTGCCATAGCAGATCTGCGCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATTAGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAACTGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACGCGAATTAATTAAGGTCGGGCAGGAAGAGGGCCTATTTCCCATGATTCCTTCATATTTGCATATACGATACAAGGCTGTTAGAGAGATAATTAGAATTAATTTGACTGTAAACACAAAGATATTAGTACAAAATACGTGACGTAGAAAGTAATAATTTCTTGGGTAGTTTGCAGTTTTAAAATTATGTTTTAAAATGGACTATCATATGCTTACCGTAACTTGAAAGTATTTCGATTTCTTGGCTTTATATATCTTGTGGAAAGGACGAAACACCGNNNNNNNNNNNNNNNNNNNNGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTTTTTTCTAGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGT
TAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCTCCCGATCCCCTATGGTGCACTCTCAGTACAATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGG

Example 2: Rapid and Highly Efficient Mammalian Cell Engineering Via Cas9 Protein Transfection

Abstract

CRISPR-Cas9 systems provide a platform for high efficiency genome editing that is enabling innovative applications of mammalian cell engineering. However, the delivery of Cas9 and synthesis of guide RNA (gRNA) remain steps that can limit overall efficiency and general ease of use. Described here are methods for rapid synthesis of gRNA and for delivery of Cas9 protein/gRNA ribonucleoprotein complexes (Cas9 RNPs) into a variety of mammalian cells through liposome-mediated transfection or electroporation. Using these methods, nuclease-mediated indel rates of up to 94% in Jurkat T cells and 87% in induced pluripotent stem cells (iPSC) for a single target are reported. When this approach was used for multigene targeting in Jurkat cells, two-locus and three-locus indels were achieved in approximately 93% and 65% of the resulting isolated cell lines, respectively. Further, in this study, it was found that the off-target cleavage rate is significantly reduced using Cas9 protein when compared to plasmid DNA transfection. Taken together, a streamlined cell engineering workflow is presented that enables gRNA design to analysis of edited cells in as little as four days and results in highly efficient genome modulation in hard-to-transfect cells. The reagent preparation and delivery to cells requires no plasmid manipulation, and is thus amenable to high throughput, multiplexed genome-wide cell engineering.

Introduction

CRISPR-Cas9 mediated genome engineering enables researchers to modify genomic DNA in vivo directly and efficiently (Cho et al., “Targeted genome engineering in human cells with the Cas9 RNA-guided endonuclease,” Nat. Biotechnol. 31:230-232 (2013); Mali et al., “RNA-guided human genome engineering via Cas9,” Science 339:823-826 (2013); Jiang et al., “RNA-guided editing of bacterial genomes using CRISPR-Cas systems,” Nat. Biotechnol. 31:233-239 (2013); Wang et al., “One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering,” Cell 153:910-918 (2013)). Three components (Cas9, mature crRNA and tracrRNA) are essential for functional activity. Although the mature crRNA and tracrRNA can be synthesized chemically, the quality of the synthetic RNA is not sufficient for in vivo cell engineering due to the presence of truncated by-products (data not shown). Therefore, templates for the mature crRNA and tracrRNA or a combined single gRNA are often cloned into a Cas9 expression plasmid or built into separate plasmids driven by either U6 or H1 promoters for transcription after transfection of mammalian cells. Because the constructs are relatively large, delivery rates can be low, which would limit genomic cleavage efficiency, especially for hard-to-transfect cells. Recently, the use of Cas9 delivered as mRNA has led to increases in the rate of genomic cleavage in some cells. For example, a mixture of Cas9 mRNA and a single species of gRNA was co-injected into mouse embryonic stem (ES) cells resulting in biallelic mutations in 95% of newborn mice (Wang et al., “One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering,” Cell 153:910-918 (2013)). To make guide RNA, a precloned plasmid is often used directly or a linear template is created via PCR amplification of the targeting sequence from a plasmid.
If a 5′ T7 promoter does not appear in the plasmid, it is often added at this step and the resulting PCR product can be used in an in vitro transcription reaction. Alternatively, a synthetic DNA fragment containing a T7 promoter, crRNA and tracrRNA can be used as a template to prepare a gRNA by in vitro transcription. Overall, these represent a labor-intensive and time-consuming workflow, which led us to seek a simpler method to synthesize high quality gRNA. To that end, described here is a streamlined modular approach for gRNA production in vitro. Starting with two short single stranded oligos, the gRNA template is assembled in a ‘one pot’ PCR reaction. The product is then used as template in an in vitro transcription (IVT) reaction which is followed by a rapid purification step, yielding transfection-ready gRNA in as little as four hours.
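By way of illustration only, the two-oligo design may be sketched as follows. The 34 nt oligo lengths and the 15 nt overlap are those given in the Materials and Methods below; the split of each oligo into 16 nt and 19 nt of a 20 nt target, the 18 nt T7 promoter string, and the function names are assumptions made for this sketch and are not asserted to be the design actually used. The 15 nt scaffold prefix is taken from the guide scaffold that follows the N(20) stretch in the vector sequences above.

```python
# Hypothetical sketch of a two-oligo gRNA template design (assumed layout).
T7_PROMOTER = "TAATACGACTCACTATAG"  # 18 nt minimal T7 promoter (assumed form)
SCAFFOLD_5P = "GTTTTAGAGCTAGAA"     # first 15 nt of the guide scaffold

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_oligos(target, fwd_target_nt=16, rev_target_nt=19):
    """Build forward and reverse oligos for a 20 nt target.

    forward = T7 promoter + first `fwd_target_nt` bases of the target
    reverse = reverse complement of (last `rev_target_nt` bases of the
              target + 5' end of the scaffold)
    Returns (forward, reverse, overlap_nt).
    """
    target = target.upper()
    if len(target) != 20 or set(target) - set("ACGT"):
        raise ValueError("target must be 20 nt of A, C, G or T")
    forward = T7_PROMOTER + target[:fwd_target_nt]
    reverse = reverse_complement(target[-rev_target_nt:] + SCAFFOLD_5P)
    overlap_nt = fwd_target_nt + rev_target_nt - len(target)
    return forward, reverse, overlap_nt


if __name__ == "__main__":
    # Placeholder 20 nt target chosen only to exercise the function.
    fwd, rev, overlap = design_oligos("GACCTGTACAGCTTCCCAGA")
    print(len(fwd), len(rev), overlap)
```

Under these assumed splits, both oligos come out at 34 nt and share a 15 nt complementary region within the target, which is what lets them prime on one another in the assembly PCR.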

To streamline the cell engineering workflow further, it was sought to eliminate any remaining cellular transcription or translation by introducing Cas9/gRNA ribonucleoprotein (RNP) complexes directly into the cells. Microinjection of Cas9 protein and gRNA complexes into C. elegans was first described in 2013 (Cho et al., “Heritable gene knockout in Caenorhabditis elegans by direct injection of Cas9-sgRNA ribonucleoproteins,” Genetics 195:1177-1180 (2013)) and was subsequently used to generate gene-knockout mice and zebrafish with mutation rates of up to 93% in newborn mice (Sung et al., “Highly efficient gene knockout in mice and zebrafish with RNA-guided endonucleases,” Genome Res. 24:125-131 (2014)). Following that report, Cas9 protein/gRNA complexes were delivered into cultured human fibroblasts and induced pluripotent stem cells (iPSC) via electroporation with high efficiency and relatively low off-target effects (Kim et al., “Highly efficient RNA-guided genome editing in human cells via delivery of purified Cas9 ribonucleoproteins,” Genome Res. 24:1012-1019 (2014)). In that study, large amounts of Cas9 protein (4.5 to 45 μg) and gRNA (6 to 60 μg) were necessary for efficient genome modification (up to 79% indel efficiency). Most recently, delivery of Cas9 protein-associated gRNA complexes via liposomes was reported, in which RNAiMAX was used to deliver Cas9:sgRNA nuclease complexes into cultured human cells and into the mouse inner ear in vivo with up to 80% and 20% genome modification efficiency, respectively (Zuris et al., “Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo,” Nat Biotechnol. October 30. doi: 10.1038/nbt.3081 (2014)).

The CRISPR/Cas system has been demonstrated as an efficient gene-targeting tool for multiplexed genome editing (Wang et al., “One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering,” Cell 153:910-918 (2013); Kabadi et al., “Multiplex CRISPR/Cas9-based genome engineering from a single lentiviral vector,” Nucleic Acids Res. October 29; 42(19):e147. doi: 10.1093/nar/gku749 (2014); Sakuma et al., “Multiplex genome engineering in human cells using all-in-one CRISPR/Cas9 vector system,” Sci Rep. June 23; 4:5400. doi: 10.1038/srep05400 (2014); Cong et al., “Multiplex genome engineering using CRISPR/Cas systems,” Science 339:819-823 (2013)). For example, co-transfections of mouse ES cells with constructs expressing Cas9 and three sgRNAs targeting Tet1, 2, and 3 resulted in 20% of cells having mutations in all six alleles of the three genes based on restriction fragment length polymorphism (RFLP) assay (Wang et al., “One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering,” Cell 153:910-918 (2013)). Lentiviral delivery of a single vector expressing Cas9 and four sgRNAs into primary human dermal fibroblasts resulted in about 30% simultaneous editing of four genomic loci among ten clonal populations based upon genomic cleavage detection assays (Kabadi et al., “Multiplex CRISPR/Cas9-based genome engineering from a single lentiviral vector,” Nucleic Acids Res. October 29; 42(19):e147. doi: 10.1093/nar/gku749 (2014)). In one recent study, ‘all-in-one’ expression vectors containing seven guide RNA expression cassettes and a Cas9 nuclease/nickase expression cassette were delivered into 293T cells with genome cleavage efficiency ranging from 4 to 36% for each individual target (Sakuma et al., “Multiplex genome engineering in human cells using all-in-one CRISPR/Cas9 vector system,” Sci Rep. June 23; 4:5400. doi: 10.1038/srep05400 (2014)).
In general, the efficiency of editing multiple genes in the human genome using plasmid-based delivery methods remains relatively low, which subsequently increases the workload for downstream clonal isolation.

An in vitro gRNA production system was developed, and a systematic approach was used to optimize the conditions for delivery of Cas9:gRNA complexes via lipid-mediated transfection or electroporation. A variety of mammalian cell lines were tested, including primary cells and other hard-to-transfect cells. Plasmid DNA, mRNA and Cas9 protein transfections were evaluated side by side. Using Cas9 protein transfection via electroporation, superior genome editing efficiencies were achieved even in hard-to-transfect cells. In addition, the simultaneous genome editing of multiple targets using the Cas9 RNP delivery system was assessed and is described here. It was found that delivery of Cas9 RNPs not only led to high indel production at a single locus, but also supported highly efficient biallelic modulation of at least two genes in a single transfection.

Materials and Methods

Materials:

293FT cells, the Gibco® Human Episomal iPSC Line, DMEM medium, RPMI 1640 medium, IMDM, DMEM/F-12, Fetal Bovine Serum (FBS), Knockout™ Serum Replacement, Non-Essential Amino Acid solution, basic fibroblast growth factor, Collagenase IV, TrypLE™ Express Enzyme, Geltrex, Opti-MEM Medium, FluoroBrite™ DMEM, Lipofectamine 2000, Lipofectamine 3000, RNAiMAX, Lipofectamine® MessengerMAX, GeneArt® CRISPR Nuclease Vector with OFP Reporter, 2% E-Gel® EX Agarose Gels, PureLink® PCR Micro Kit, TranscriptAid T7 High Yield Transcription Kit, MEGAclear™ Transcription Clean-Up Kit, Zero Blunt® TOPO® PCR Cloning Kit, PureLink® Pro Quick96 Plasmid Purification Kit, Endotoxin Quantitation Kit, Qubit® RNA BR Assay Kit, TRA-1-60 Alexa Fluor® 488 conjugated antibodies, SSEA4 Alexa Fluor® 647, and Phusion Flash High-Fidelity PCR Master Mix were from Thermo Fisher Scientific. Jurkat T cells and K562 cells were obtained from the American Type Culture Collection (ATCC). MEF feeder cells and ROCK inhibitor Y-27632 were purchased from EMD Millipore. Monoclonal Cas9 antibody was ordered from Diagenode. Recombinant Cas9 protein was purified as described by Kim et al. (7). All oligonucleotides used for gRNA synthesis were from Thermo Fisher Scientific (Supplementary Table 1s).

One Step Synthesis of gRNA Template

The 80 nt constant region of tracrRNA from a GeneArt® CRISPR Nuclease Vector was amplified by PCR and purified via agarose gel extraction. The concentration of PCR product was measured by Nanodrop (Thermo Fisher Scientific) and the molarity was calculated based on the molecular weight of 49.6 kDa. To prepare a pool of oligonucleotides, an aliquot of the 80 nt PCR product was mixed with two end primers and target-specific forward and reverse primers, with a final concentration of 0.15 μM for the 80 nt PCR product and 10 μM for each of the end primers. For a specific target, a 34 nt forward primer consisting of the T7 promoter sequence and 5′ end target sequence, and a 34 nt reverse primer consisting of the target sequence and 5′ end tracrRNA sequence were chemically synthesized with a 15 nt overlap. To set up the synthesis of gRNA template, aliquots of the pooled oligonucleotides were added to a Phusion Flash High-Fidelity PCR Master Mix and amplified using the manufacturer's recommended reaction conditions. The PCR product was analyzed on a 2% E-Gel® EX Agarose Gel, followed by purification using a PureLink® PCR micro column. The gRNA template was eluted with 13 μl water and the concentration was determined with a Nanodrop instrument.
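The molarity calculation referred to above may be sketched as follows. The 49.6 kDa molecular weight and the 0.15 μM final concentration are from the text; the example stock reading of 62 ng/μl, the 100 μl pool volume, and the function names are hypothetical values chosen only to illustrate the arithmetic.

```python
# Minimal sketch: convert a mass concentration to molarity and work out
# how much stock is needed to reach a target final concentration.
MW_80NT_PRODUCT = 49_600.0  # g/mol, i.e. 49.6 kDa as stated in the text


def ng_per_ul_to_micromolar(conc_ng_per_ul, mw_g_per_mol):
    """ng/ul equals mg/l; dividing by g/mol gives mmol/l; x1000 gives uM."""
    if mw_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return conc_ng_per_ul * 1000.0 / mw_g_per_mol


def stock_volume_ul(stock_um, final_um, final_volume_ul):
    """Volume of stock needed, from C1 * V1 = C2 * V2."""
    if final_um > stock_um:
        raise ValueError("stock is more dilute than the requested final concentration")
    return final_um * final_volume_ul / stock_um


if __name__ == "__main__":
    stock = ng_per_ul_to_micromolar(62.0, MW_80NT_PRODUCT)  # hypothetical reading
    volume = stock_volume_ul(stock, 0.15, 100.0)            # hypothetical 100 ul pool
    print(f"stock = {stock:.2f} uM, use {volume:.1f} ul per 100 ul pool")
```

The same two functions apply to the end primers by substituting their molecular weights and the 10 μM target.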

To determine the error rate, the PCR product was cloned into a Zero Blunt® TOPO® vector, followed by plasmid DNA isolation and sequencing with a 3500xl DNA analyzer (Thermo Fisher Scientific).

In Vitro Transcription

The in vitro transcription of gRNA template was carried out with the TranscriptAid T7 High Yield Transcription Kit under the manufacturer's recommended conditions. The gRNA product was purified using the MEGAclear™ Transcription Clean-Up kit as described in the manual. The concentration of RNA was determined using the Qubit® RNA BR Assay Kit.

Cell Culture

HEK 293FT cells were maintained in DMEM medium supplemented with 10% FBS. Jurkat T cells were propagated in RPMI medium containing 10% FBS, whereas K562 cells were cultured in IMDM medium supplemented with 10% FBS. Feeder-dependent human episomal iPSC were cultured on mitotically inactivated MEF feeder cells in human ESC (hESC) media containing 20% Knockout™ Serum Replacement, 10 μM Non-Essential Amino Acid solution, 55 μM 2-Mercaptoethanol, and 4 ng/ml basic fibroblast growth factor in DMEM/F-12. All cultures were maintained in a 5% CO₂, 37° C. humidified incubator. iPSC cultures were maintained with daily media changes and were passaged regularly using Collagenase IV.

Lipid-Mediated Cell Transfection

One day prior to transfection, the cells were seeded in a 24-well plate at a cell density of 2.5×10⁵ cells per well. For plasmid DNA transfection, 0.5 μg DNA was added to 25 μl of Opti-MEM medium, followed by addition of 25 μl of Opti-MEM containing 2 μl of Lipofectamine 2000. The mixture was incubated at room temperature for 15 minutes and then added to the cells. For Cas9 mRNA transfection, 0.5 μg Cas9 mRNA (Thermo Fisher Scientific) was added to 25 μl of Opti-MEM, followed by addition of 50-100 ng gRNA. Meanwhile, 2 μl of Lipofectamine 3000 was diluted into 25 μl of Opti-MEM and then mixed with the mRNA/gRNA sample. The mixture was incubated for 15 minutes prior to addition to the cells. For Cas9 protein transfection, 500 ng of purified Cas9 protein (Thermo Fisher Scientific) was added to 25 μl of Opti-MEM medium, followed by addition of 120 ng gRNA. The molar ratio of gRNA over Cas9 protein was approximately 1:1.2. The sample was mixed by gently tapping the tubes a few times and then incubated at room temperature for 10 minutes. To a separate test tube, 2 μl of RNAiMAX or Lipofectamine 3000 was added to 25 μl of Opti-MEM medium. The diluted transfection reagent was transferred to the tube containing Cas9 protein/gRNA complexes, followed by incubation at room temperature for 15 minutes. The entire solution was then added to the cells in a 24-well plate and mixed by gently swirling the plate. The plate was incubated at 37° C. for 48 hours in a 5% CO₂ incubator. The percentage of locus-specific indel formation was measured by GeneArt® Genomic Cleavage Detection Kit. The band intensities were quantitated using built-in software in Alpha Imager (Bio-Rad).
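For orientation, one relationship commonly used to convert band intensities from a mismatch-cleavage gel into an estimated indel percentage is sketched below. The text does not state which formula was applied to the quantitated bands, so the formula, the function name, and the example intensities are assumptions of this sketch rather than the procedure actually used.

```python
import math


def cleavage_efficiency(parent_band, cleaved_bands):
    """Estimate the indel fraction from gel band intensities.

    fraction_cleaved = sum(cleaved) / (parent + sum(cleaved))
    efficiency       = 1 - sqrt(1 - fraction_cleaved)

    The square root reflects that, after denaturing and re-annealing,
    only duplexes formed from two unmodified strands escape cleavage.
    """
    cleaved = float(sum(cleaved_bands))
    total = float(parent_band) + cleaved
    if total <= 0 or parent_band < 0 or cleaved < 0:
        raise ValueError("band intensities must be non-negative with a positive total")
    fraction_cleaved = cleaved / total
    return 1.0 - math.sqrt(1.0 - fraction_cleaved)


if __name__ == "__main__":
    # Hypothetical intensities: one parent band and two cleavage products.
    print(f"{100 * cleavage_efficiency(25.0, [50.0, 25.0]):.1f}% estimated indels")
```

Intensities would typically be background-subtracted before being passed in; that step is left to the imaging software.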

Electroporation

For suspension cells, such as Jurkat T cells or K562 cells, 2×10⁵ cells were used per electroporation using the Neon® Transfection System 10 μL Kit (Thermo Fisher Scientific). To maximize the genome cleavage efficiency, the Neon 24 optimized protocol was applied according to the manufacturer's instructions. To set up a master mix, 24 μg of purified Cas9 protein was added to 240 μL of Resuspension Buffer R provided in the kit, followed by addition of 4.8 μg of gRNA. The mixture was incubated at room temperature for 10 minutes. Meanwhile, 4.8×10⁶ cells were transferred to a sterile test tube and centrifuged at 500×g for 5 minutes. The supernatant was aspirated and the cell pellet was resuspended in 1 ml of PBS without Ca²⁺ and Mg²⁺. Upon centrifugation, the supernatant was carefully aspirated so that almost all the PBS buffer was removed with no or minimum loss of cells. The Resuspension Buffer R containing the Cas9 protein/gRNA complexes was then used to resuspend the cell pellets. A 10 μl cell suspension was used for each of the 24 optimized conditions, which varied in pulse voltage, pulse width and the number of pulses. The electroporated cells were transferred immediately to a well of a 24-well plate containing 0.5 ml of the corresponding growth medium and then incubated for 48 hours in a 5% CO₂ incubator. The cells were harvested by centrifugation and then washed once with PBS, followed by the Genomic Cleavage Detection assay as described in the manual. Upon optimization of the electroporation conditions, a higher amount of Cas9 protein (1.5 to 2 μg) and gRNA (300 to 400 ng) could be applied to further increase the genome editing efficiency. For each target in the multiplexing assays, 1 to 2 μg of Cas9 protein and 200-400 ng of gRNA were pre-incubated separately prior to mixing with the cell pellet for electroporation.
For clonal isolation, the transfected cells were counted after the 48 hour incubation, followed by serial dilution into 96 well plates at a cell density of 10-20 cells per plate based on the cell count. After clonal expansion for three weeks, cells from each individual well were harvested, followed by PCR amplification of the target locus. The PCR fragments were then cloned using a TOPO vector and transformed into TOP10 competent cells. Approximately 8 E. coli colonies were randomly picked for sequencing for each individual target locus. The single cell population was determined by the homogeneity of sequences for each allele. Single cells containing bi-allelic mutations on all desired targets were considered homozygotic indels. Downstream sequence analysis to confirm frame-shift induced stop codon introduction was not done.
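The electroporation master mix quantities given above can be checked with a short calculation. The totals (24 μg Cas9, 4.8 μg gRNA, 4.8×10⁶ cells, 240 μL Buffer R, 10 μL per reaction) are those stated in the text; the function and key names are merely illustrative.

```python
def per_reaction_amounts(cas9_ug, grna_ug, cells, buffer_ul, reaction_ul):
    """Split a master mix into equal reactions and report per-reaction amounts."""
    if reaction_ul <= 0 or buffer_ul < reaction_ul:
        raise ValueError("reaction volume must be positive and no larger than the buffer volume")
    n_reactions = buffer_ul / reaction_ul
    return {
        "reactions": n_reactions,
        "cas9_ug": cas9_ug / n_reactions,
        "grna_ng": grna_ug * 1000.0 / n_reactions,  # ug converted to ng
        "cells": cells / n_reactions,
    }


if __name__ == "__main__":
    amounts = per_reaction_amounts(
        cas9_ug=24.0, grna_ug=4.8, cells=4.8e6, buffer_ul=240.0, reaction_ul=10.0
    )
    for name, value in amounts.items():
        print(f"{name}: {value:g}")
```

With the stated totals this works out to 24 reactions of about 1 μg Cas9, 200 ng gRNA and 2×10⁵ cells each, consistent with the per-electroporation cell number and the 24 conditions described above.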

For feeder-free adaptation of iPSC prior to transfection, feeder-dependent iPSC were grown to 80% confluency prior to harvest with collagenase. Following removal of the cell clusters from the feeder layer, they were gravity sedimented to prevent MEF contamination. The cell clusters were then seeded onto tissue culture dishes coated with Geltrex® in MEF conditioned media supplemented with 4 ng/mL bFGF. MEF conditioned media was produced using inactivated feeder cells; it was harvested on 7 consecutive days, sterile filtered and frozen until use. The cultures were allowed to reach 80-90% confluence. The day prior to transfection, the cultures were pretreated with 5 μM ROCK inhibitor Y-27632. On the day of harvest the cultures were inspected for signs of differentiation and any contaminating differentiated cells were removed via micro-dissection. The cultures were washed once with DPBS and then harvested using TrypLE™ Express Enzyme. Single cell suspensions were counted using the Countess® automated cell counter. Following transfection, the cells were seeded onto multi-well (24 well) tissue culture dishes coated with Geltrex® and incubated overnight with MEF conditioned media containing 5 μM ROCK inhibitor. Media was replaced daily, without ROCK inhibitor, prior to analysis.

Cell Surface Immunostaining

To ensure maintenance of pluripotency post transfection and genome editing, iPSC cells were tested for expression of cell surface markers of self-renewal. The wells to be probed were washed with DMEM/F-12 basal media. TRA-1-60 Alexa Fluor® 488 conjugated antibodies and SSEA4 Alexa Fluor® 647 were multiplexed in basal DMEM/F-12 media. Two microliters of each antibody were added to 0.5 mL of pre-warmed DMEM/F-12 media and incubated at 37° C. for 45 minutes. Following the incubation, the antibody solution was removed and the wells were washed twice with DMEM/F-12. Prior to observation the media was exchanged with pre-warmed FluoroBrite™ DMEM. Images were taken using a Zeiss Axiovision microscope using a FITC and Cy5 laser/filter combination.

Analysis of Pluripotency Markers

Cultures were detached and dissociated using TrypLE™ Select and trituration. Single cell suspensions were incubated with TRA-1-60 Alexa Fluor® 488 conjugated antibodies and SSEA4 Alexa Fluor® 647 for 1 hour at room temperature with gentle agitation. Two microliters (50×) of each antibody were added to 0.5 mL of DMEM/F-12. Following the incubation, the cells were centrifuged and washed once with Dulbecco's Phosphate-Buffered Saline (DPBS). After the removal of the DPBS wash, the pelleted cells were gently re-suspended in 1 mL of DPBS and strained through a strainer-capped tube. The cells were then measured for the expression of both markers using the ATTUNE® Acoustic Focusing Cytometer and the data were analyzed using FlowJo software.

Western Blot Analysis

293FT cells were transfected with either plasmid DNA, mRNA or Cas9 protein as described above. Cells were harvested at indicated times to perform both Genome Cleavage and Detection assay and Western Blot analysis. The cell lysate was fractionated using a 4-12% Novex Bis-tris gel. The proteins were transferred to a PVDF membrane using an iBlot following the manufacturer's protocol. Upon blocking, the membrane was incubated for 2 hours with monoclonal mouse Cas9 antibody at 1:3000 dilution. After washing, the membrane was incubated for 1 hour with rabbit anti-mouse antibody-HRP conjugate at 1:2000 dilution. Upon extensive washing, the membrane was developed with Pierce ECL reagent, followed by imaging using a Fuji imager LAS 4000 instrument.

Results

Three Day Cell Engineering Workflow

To streamline the genome engineering workflow, it was sought to simplify the gRNA synthesis procedure and shorten the time from experimental design to initial analysis as much as possible. Presented herein is a process where on day 1, the researcher designs and orders short DNA oligonucleotides and seeds the cells of interest for next day transfection (FIG. 19). Upon receiving the oligonucleotides on day 2, the researcher assembles the gRNA template in less than 1 hour by ‘one pot’ PCR. The resulting PCR product is then subjected to in vitro transcription to synthesize gRNA in approximately 3 hours. Upon association of gRNA with purified Cas9 protein, the Cas9 protein/gRNA complexes (Cas9 RNPs) are used to transfect cells via lipid-mediated delivery or electroporation. As early as day 3 (24 hours post transfection), the cells can be harvested for analysis of locus-specific genome modification efficiency.

To assemble the DNA template for gRNA production, a total of 4 synthetic DNA oligonucleotides and a purified PCR product representing the constant (non-targeting) crRNA region and tracrRNA sequence (gRNA lacking target sequence) are used (FIG. 20A). A pair of 34 nt forward and reverse oligonucleotide primers were designed by an online web tool (Beta Testing Version, Thermo Fisher Scientific), and share 15 nt homology with the CRISPR and tracrRNA regions respectively. The oligonucleotide pool concentrations as well as the PCR conditions were optimized such that the template was amplified in less than 40 minutes in a single tube with no obvious by-products (FIG. 20B). The gRNA template was used directly to prepare gRNA via in vitro transcription (IVT). The resulting gRNA was purified, yielding high levels of gRNA with no detectable by-products (FIG. 20C). This approach was validated by synthesis of more than 96 distinct gRNAs. To determine the error rate in the synthetic DNA template, the PCR fragments were cloned and sequenced and it was found that approximately 7% of gRNA templates harbored mutations, mainly small deletions occurring at the extreme 3′ and 5′ ends of the mature template. The use of HPLC-purified end primers further decreased the error rate to 3.6% with no mutations detected in the target region, which was similar to what was observed with the control template prepared from an ‘all-in-one’ plasmid with a 2% error rate (FIG. 20D). Taken together, this optimized process facilitates the conversion of a small set of DNA oligonucleotides into purified gRNA in approximately 4 hours with an accuracy of 96% and no errors detected in the targeting or Cas9 complexing (cr/tracrRNA) regions. Given that the process consists solely of liquid-handling PCR, transcription, and RNA isolation steps, it is well suited for high throughput gRNA production and screening.

Liposome-Mediated Cas9 Protein Transfection

To examine the activity of synthetic gRNA, purified IVT gRNA was pre-complexed with Cas9 protein. It was hypothesized that creating complexes of purified gRNAs with Cas9 protein prior to delivery to the cells might lead to higher genome editing efficiency due to the protection of the gRNA as it transits to the nucleus during the transfection process. To examine in vivo functionality of the system, human embryonic kidney (HEK293) cells were transfected with pre-complexed Cas9/gRNA ribonucleoproteins (Cas9 RNPs) using a set of cationic lipid reagents, followed by a genomic cleavage detection assay. Interestingly, the commonly-used plasmid DNA or RNA transfection reagents were able to efficiently deliver Cas9 RNPs. Lipofectamine 3000 and RNAiMAX outperformed Lipofectamine 2000 in HEK293 cells (data not shown), which is in agreement with the recent finding that RNAiMAX performed better than Lipofectamine 2000 for delivery of Cas9 mRNA (Zuris et al., “Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo” Nat Biotechnol. October 30. doi: 10.1038/nbt.3081 (2014)). For protein transfection, serum-free medium is generally used to avoid serum protein interference. In this study however, it was observed that the complete medium containing 10% FBS facilitated protein transfection and genome modification (FIG. 21A). The efficiencies of genome editing via plasmid DNA, mRNA and Cas9 RNP transfection were evaluated using three different target loci, HPRT, AAVS and RelA. Plasmid DNA and mRNA were delivered into HEK293 cells by Lipofectamine 3000, whereas Cas9 RNPs were delivered with RNAiMAX. As shown in FIG. 21A, the efficiencies of genome modification were similar among three target loci in DNA, mRNA and Cas9 protein-transfected cells.

Next examined was the kinetics of genome cleavage by transfecting cells with either plasmid DNA, mRNA or Cas9 RNPs, followed by genome cleavage assays and Western Blot analysis of cell lysates. In this study, similar cleavage kinetics were observed for Cas9 delivered as plasmid DNA, mRNA and protein, with efficient cleavage seen at 24 hours and plateauing at 48 to 72 hours post-transfection in HEK293 cells (FIG. 21B). It was found that the kinetics of Cas9 RNP and mRNA encoded Cas9 appearance and turnover inside the transfected cells were quite different from those seen with Cas9 delivered via plasmid DNA. Measuring by Western Blot (FIG. 21C), it was found that Cas9 protein accumulated over time as expected in plasmid DNA-transfected cells, whereas the relatively low expression of Cas9 in mRNA-transfected cells seemed to peak as early as four hours post transfection and remained relatively stable for approximately 44 hours before diminishing. In the Cas9 RNP-transfected cells, the level of Cas9 protein peaked in four hours or less, then rapidly decreased and was barely detectable in our assay at 48 hours. As a control, the blot membrane was stripped and re-probed with anti-actin antibody. Similar levels of actin expression were observed among samples (data not shown).

Because of the difference in protein appearance and apparent turnover rates, it was hypothesized that the off-target cleavage activity for Cas9 RNP transfection would be lower than that of plasmid DNA transfection. This was tested by targeting a locus in the VEGFA gene which has been identified as having several high activity off-target sites (Tsai et al., “GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases,” Nat Biotechnol. doi:10.1038/nbt.3117 (2014)) via DNA, mRNA, and Cas9 RNP transfection followed by genome cleavage and locus sequencing analysis. Among the six potential off-target sites that have been studied previously (OT3-1, OT3-2, OT3-4, OT3-9, OT3-17 and OT3-18), only OT3-2 and OT3-18 were detected to harbor off-target mutations based on genome cleavage analysis. Further analysis of locus OT3-2 by sequencing indicated that the ratio of indel mutation at OT3-2 over on-target in mRNA and Cas9 RNP transfected cells was 2 fold and 2.5 fold lower than that in DNA-transfected cells, respectively. The ratio of indel mutation at OT3-18 over on-target was 1.6 fold and 28 fold lower in mRNA or Cas9 RNP-transfected cells respectively than in DNA-transfected cells (FIG. 21D). The on-target editing efficiency increased with an increased dose of Cas9 RNP, reaching a plateau at around 2 μg of Cas9 protein, while the off-target modification at the loci examined remained low and constant (data not shown). Taken together, these data suggest that Cas9 delivery as mRNA and pre-complexed protein supports increased genomic cleavage specificity compared with standard DNA plasmid transfection.

Electroporation-Mediated Cas9 Protein Transfection

Many biologically and physiologically relevant cell lines, such as patient derived iPSC and progenitor cells, are refractory to efficient transfection by lipid-based reagents. Any improvement in the efficiency of genome modulation would facilitate isolation of appropriately engineered cells for experimentation and therapy, so alternate means of delivering Cas9 RNPs and Cas9 mRNA/gRNA formulations and their effect on indel generation were explored. Using Jurkat T cells as an initial model, the delivery of Cas9 and gRNA plasmid DNA, Cas9 mRNA/gRNA formulations and Cas9 RNPs was compared using microporation (described in Materials and Methods, data not shown). Our results showed that, compared with plasmid DNA and mRNA deliveries, superior genome editing efficiency was achieved via delivery of Cas9 RNPs with approximately 90% HPRT locus-specific modification under several electroporation conditions (FIG. 22A). In general, Cas9 RNP delivery was more robust than DNA or mRNA delivery over most of the electroporation conditions tested. The cleavage efficiency was dose-dependent, reaching a maximum at approximately 1.5 μg Cas9 protein and 300 ng gRNA (~1:1 molar ratio) per transfection. After sequencing the cell pools it was found that 94% of target loci harbored mutations at a cleavage site located 3 bases upstream of the NGG PAM sequence (Supplementary sequencing data). In agreement with previous work, the majority of mutations were distinct from each other with 73% insertion, 18% deletion and 3% base substitution. Given the high single-locus cleavage efficiency measured with the Cas9 RNP system, the ability to efficiently lesion multiple genes in a single transfection was tested. Here the capability of multiplexing Cas9 RNP transfection at three loci (AAVS1, RelA and HPRT) was examined.
After pooling and delivering multiple species of Cas9 RNP (differing only by gRNA target), it was found that the efficiency of simultaneous editing of AAVS1/HPRT or AAVS1/RelA/HPRT loci was significantly greater at all loci compared with either plasmid or mRNA delivery of Cas9 (FIGS. 23A and 23C). To gain insight into the molecular level of multiplexing, one round of clonal isolation by serial dilution was performed. After clonal expansion each of the loci was PCR amplified, followed by DNA cloning and sequencing. In the case of two gene editing, it was found that all of 16 isolated clonal cell lines harbored bi-allelic indel mutations at the AAVS1 locus and 93.7% (15 of 16) of clonal cells harbored one allelic indel mutation at the HPRT locus, as the HPRT target was located on the X chromosome of a male Jurkat T cell line. Overall, 93.7% of the clonal cell populations carried indel mutations on both the AAVS1 and HPRT loci (FIG. 23B). For multiplexing of three genes, three individual cell transfections and clonal isolations were performed with a total of 53 single cell lines analyzed. In this experiment, 90% and 65% of the clonal cell lines analyzed harbored bi-allelic indel mutations at the AAVS1 and RelA loci respectively, whereas 80% of the clonal cells carried indel mutations at the HPRT locus. Approximately 65% of the clonal cells carried bi-allelic indel mutations on both AAVS1 and RelA loci, whereas 80% and 65% of the clonal cells harbored indel mutations on AAVS1/HPRT loci and RelA/HPRT loci respectively. Overall, 65% of the clonal cell lines harbored indel mutations on all three targets (FIG. 23D). Further, 100% of the Jurkat T cell clones were edited at least once, suggesting that the transfection efficiency reached nearly 100%. Taken together, Cas9 RNP delivery via electroporation under the conditions used here achieved exceptionally high mutagenesis frequencies.
This represents a substantial improvement in Cas9-mediated genome editing and significantly reduces the workload needed for clonal isolation by substantially reducing the number of cells that must be screened in order to identify and isolate the desired cell line.
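
The observation that mutations cluster at a site 3 bases upstream of the NGG PAM can be illustrated with a small sketch that predicts where Cas9 would be expected to cut in a target sequence. This is a simplified illustration only: it scans just the strand it is given, assumes a 20-nucleotide protospacer, and uses a made-up toy sequence and a hypothetical function name; it is not the design tool referred to in this disclosure.

```python
import re


def find_cas9_cut_sites(sequence, protospacer_len=20):
    """Return (protospacer_start, cut_index) pairs for each NGG PAM on the given strand.

    cut_index is the 0-based index of the first base on the PAM-proximal side of
    the predicted cut, so the cut falls 3 bases upstream of the PAM.
    """
    sequence = sequence.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs (for example in "GGG") are all found.
    for match in re.finditer(r"(?=[ACGT]GG)", sequence):
        pam_start = match.start()
        protospacer_start = pam_start - protospacer_len
        if protospacer_start < 0:
            continue  # not enough upstream sequence for a full protospacer
        sites.append((protospacer_start, pam_start - 3))
    return sites


# Hypothetical toy target: a 20-nt protospacer, a TGG PAM, and one trailing base.
toy_target = "ACGTACGTACGTACGTACGT" + "TGG" + "A"
for start, cut in find_cas9_cut_sites(toy_target):
    print(f"protospacer starts at {start}; predicted cut between {cut - 1} and {cut}")
```

For the toy target the only PAM begins at index 20, so the predicted cut lies between indices 16 and 17, three bases before the PAM.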

Discussion

The ability to easily modulate the sequence specificity of the Cas9 nuclease by simply changing the 20 nucleotide targeting sequence of the gRNA offers significant versatility in delivery options over other nucleases that have been utilized for genome editing, such as zinc finger nucleases and TAL effectors. Now, researchers are able to choose from cost-effective and rapid design options by formulating the nuclease as either plasmid DNA, pre-made mRNA or purified protein. The design versatility is enabled by rapid production of the guide RNA component. Until recently, the gRNA was generally produced via cloning of a template sequence into a plasmid vector or vectors and expressing the Cas9 and gRNA in vivo. Described here is a streamlined protocol where gRNA design and template construction are facilitated by synthesis of two short single stranded oligonucleotides. The oligonucleotides are incorporated into gRNA templates via a short PCR reaction followed by conversion to gRNA by in vitro transcription. Target-specific oligos can be designed, ordered, and converted to purified gRNA in as little as two days. On the second day, the gRNA is formulated with either Cas9 mRNA or protein, and immediately used to transfect cells. The entire process consists completely of liquid handling and enzymatic reaction steps, which makes it amenable to higher throughput gRNA production and transfection in multi-well plates.

The streamlined gRNA workflow was compared across the three delivery options and it was found that in general, Cas9/gRNA ribonucleoprotein complexes (Cas9 RNPs) offered superior indel production efficiency in most of the cell lines used as a test bed. It is currently not clear why Cas9 RNP and total RNA formulations perform as they do, but contributing factors could be the overall size of the lipid complexes, the ability of Cas9 protein to protect the gRNA from cellular degradation, and the elimination of DNA-based cellular toxicity. In relation to plasmid delivery, Cas9 introduced as a Cas9 RNP or mRNA appears in the cell at low but evidently functional levels and is cleared rapidly, which could also reduce the opportunity for off-target binding and cleavage. The data presented above suggest that this could be the case but a significantly more detailed evaluation is needed for confirmation.

Much progress has been made to reduce or eliminate off-target cleavage in CRISPR systems, such as use of paired Cas9 nickases and dimeric ‘dead Cas9’ FokI fusions, which has been shown to reduce off-target activity by 50- to 1,500-fold (Tsai et al., “GUIDE-seq enables genome-wide profiling of off-target cleavage by CRISPR-Cas nucleases,” Nat Biotechnol. doi:10.1038/nbt.3117 (2014); Fu et al., “High-frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells,” Nat. Biotechnol. 31:822-826 (2013)). Perhaps delivery of these tools via Cas9 RNPs would lead to even higher specificity while retaining high activity levels.

In this work, it was shown that it is possible to multiplex three Cas9 RNP species targeting separate loci in Jurkat T cells while achieving high levels of indel production at all three loci. Further, high rates of biallelic modification were observed at two diploid loci (AAVS1 and RelA) in these experiments even when also modifying a haploid locus (HPRT) at similarly high levels. Taken together, the high rates of biallelic modification in cell populations suggest that employing Cas9 RNP delivery would significantly simplify the workflow by facilitating the selection of multigene knockout cell lines from a single experiment.
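
The effect of editing frequency on screening workload can be made concrete with a simple probability estimate. The sketch below assumes that clones are independent and that each carries the desired genotype with the same probability; the function name and the 5% comparison value are hypothetical and chosen only for illustration, whereas the 65% figure is the fraction of clones reported above as edited at all three targets.

```python
import math


def clones_to_screen(fraction_edited, confidence=0.95):
    """Smallest number of independent clones giving at least `confidence`
    probability that one or more of them carries the desired genotype."""
    if not 0.0 < fraction_edited < 1.0:
        raise ValueError("fraction_edited must be strictly between 0 and 1")
    # P(no desired clone among n) = (1 - f)^n, so solve (1 - f)^n <= 1 - confidence.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction_edited))


# 65% of clones carried indels at all three targets in the experiment above.
print(clones_to_screen(0.65))   # 3
# Hypothetical comparison: a method yielding the desired genotype in 5% of clones.
print(clones_to_screen(0.05))   # 59
```

Under these assumptions, screening three clones would be expected to yield at least one triple-edited line with 95% probability, compared with 59 clones at the hypothetical 5% frequency.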

A survey was performed of eleven commonly used mammalian cell lines comparing CRISPR delivery via plasmid, Cas9 mRNA/gRNA, and Cas9 RNP (Table 11), and it was found that Cas9 mRNA/gRNA or Cas9 RNPs were superior to plasmid delivery in all cell lines tested. Delivery of these reagents via microporation offered the highest target-specific indel production under the conditions tested. In all but one case (NHEK cells), Cas9 RNP outperformed Cas9 mRNA/gRNA, and in human CD34+ cord blood cells, Cas9 RNP delivered via microporation was the only method that yielded a significantly robust editing solution.

TABLE 11
Transfection efficiency in variety of cell lines

                               DNA               RNA               Protein
Cell lines                     Lipid    Elect.   Lipid    Elect.   Lipid    Elect.
293FT                          49.4     48.7     70       40.3     51.4     88
U2OS                           15.0     50.3     21.4     23.6#    18.4     69.5
Mouse ESCs                     30       45       45       20       25       70
Human ESCs (H9)                0        8        20       50       0        64
Human iPSCs                    0        20       66       31.6     0        87*
N2A                            65.8     75.7     65.6     80.2     66.3     82.3
Jurkat                         0        63       0        42       0        94*
K562                           0        45       0        27       0        72
A549                           15.0     44.3     23.1     28.7     19.7     65.5
Human keratin. (NHEK)          0        30       0        50       0        35
Human Cord blood cells CD34+   n/a      5        n/a      0        n/a      24

Notes:
1) HPRT for human cell lines and Rosa 26 for mouse cell lines
2) *confirmed by sequencing
3) # Cleavage efficiency could be increased to 68% when Lipid was added into reaction before electroporation.
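
The comparisons drawn from Table 11 in the text can be reproduced with a short script. This is a minimal sketch that transcribes the table values into a Python dictionary (the variable and function names are illustrative, and "n/a" entries are represented as `None`) and then compares the best reported efficiency for each delivery format per cell line.

```python
# Table 11 values as (lipid, electroporation) pairs; None marks an "n/a" entry.
TABLE_11 = {
    "293FT": {"DNA": (49.4, 48.7), "RNA": (70, 40.3), "Protein": (51.4, 88)},
    "U2OS": {"DNA": (15.0, 50.3), "RNA": (21.4, 23.6), "Protein": (18.4, 69.5)},
    "Mouse ESCs": {"DNA": (30, 45), "RNA": (45, 20), "Protein": (25, 70)},
    "Human ESCs (H9)": {"DNA": (0, 8), "RNA": (20, 50), "Protein": (0, 64)},
    "Human iPSCs": {"DNA": (0, 20), "RNA": (66, 31.6), "Protein": (0, 87)},
    "N2A": {"DNA": (65.8, 75.7), "RNA": (65.6, 80.2), "Protein": (66.3, 82.3)},
    "Jurkat": {"DNA": (0, 63), "RNA": (0, 42), "Protein": (0, 94)},
    "K562": {"DNA": (0, 45), "RNA": (0, 27), "Protein": (0, 72)},
    "A549": {"DNA": (15.0, 44.3), "RNA": (23.1, 28.7), "Protein": (19.7, 65.5)},
    "Human keratin. (NHEK)": {"DNA": (0, 30), "RNA": (0, 50), "Protein": (0, 35)},
    "Human Cord blood cells CD34+": {"DNA": (None, 5), "RNA": (None, 0), "Protein": (None, 24)},
}


def best(values):
    """Highest reported efficiency for one delivery format, ignoring n/a entries."""
    return max(v for v in values if v is not None)


# Cell lines where the best mRNA/gRNA result exceeds the best Cas9 RNP result.
rna_beats_protein = [
    cell for cell, row in TABLE_11.items() if best(row["RNA"]) > best(row["Protein"])
]

# True if, in every cell line, the best RNA or protein result exceeds the best plasmid result.
plasmid_never_best = all(
    max(best(row["RNA"]), best(row["Protein"])) > best(row["DNA"])
    for row in TABLE_11.values()
)

print("mRNA/gRNA outperforms RNP in:", rna_beats_protein)
print("RNA or RNP beats plasmid in every cell line:", plasmid_never_best)
```

With the values as transcribed, NHEK is the only cell line in which mRNA/gRNA outperforms Cas9 RNP, and plasmid delivery is never the best option, consistent with the summary given in the text.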

Described here is a streamlined approach to the mammalian genome engineering workflow that takes as few as three days to modify mammalian genomes from CRISPR target design to evaluation of genome editing. To achieve a high mutagenesis efficiency in hard-to-transfect cells, a systematic approach was used to optimize transfection conditions and compare delivery of CRISPR editing tools via plasmid DNA, Cas9 mRNA/purified guide RNA (gRNA) formulations, and pre-complexed Cas9 protein and gRNA ribonucleoproteins (Cas9 RNPs). Cas9 mRNA/gRNA and Cas9 RNP performance was found to be superior to that of ‘all-in-one’ plasmid DNA constructs in the variety of cell lines analyzed in this work. Most likely due to the high efficiency of Cas9 RNP delivery, it was possible to efficiently modify the genome at multiple loci simultaneously, thereby reducing the workload for downstream clonal isolation in schemes where more than one gene knock-out is desired. Further, it was found that delivery of Cas9 RNPs to cell lines considered hard to transfect (Jurkat, iPSC, CD34+) via electroporation yielded high levels of locus specific modification.

TABLE 12
Structure of Donor DNA Molecules

Oligo     SEQ ID   Sequence
BT1/T8    38       OOCTGGCCCACCCTCGTGACCACCTTCACTFOG
          39       GEEACCGGGTGGGAGCACTGGTGGAAGTGGATEO
3OT1/T8   40       OOCTGGCCCACCCTCGTGACCACCTTCACCTACGGCGZEC
          41       GEECACGGGACCGGGTGGGAGCACTGGTGGAAGTGGATEO
5OT1/T8   42       OOCGTGCCCTGGCCCACCCTCGTGACCACCTTCACCTFOG
          43       GEEACCGGGTGGGAGCACTGGTGGAAGTGGATGCCGCAOE
3O/T8     44       OZTCACCTACGGCGZEC
          45       CFOTGGTGGAAGTGGATEO
5O/T8     46       EZGACCACCTTCACCTFOG
          47       GFFGTGGATGCCGCAEO
BT2/T8    48       OFCCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCTTCACCTFOG
          49       GZEGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGAAGTGGATEO
3OT/T8    50       OFCCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCTTCACCTACGGCGZEC
          51       GFOGTGGTGGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGAAGTGGATEO
5OT2/T8   52       OZGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCTTCACCTFOG
          53       GZEGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGAAGTGGATGCCGCAOE
SSP-      54       OFTCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCTTCACCTACGGCGTGCAGTGCTTCGCCCGCTACCCCGACCACFZG
DS DNA    55       ATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGG
                   TGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGC
                   GAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCC
                   CTGGCCCACCCTCGTGACCACCTTCACCTACGGCGTGCAGTGCTTCGCCCGCTACCCCGACCACATGA
                   AGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAG
                   GACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGA
                   GCTGAAGGGCATCGACTTCAAGGAGG

Legend: F = Phosphorothioate-A, O = Phosphorothioate-C, E = Phosphorothioate-G, Z = Phosphorothioate-T
Regions of sequence homology are underlined
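
The single-letter code in the legend of Table 12 can be expanded programmatically. The following is a minimal sketch, with a hypothetical function name, that converts a coded oligonucleotide into its plain base sequence and reports which positions carry a phosphorothioate modification; it relies only on the legend (F, O, E and Z standing for phosphorothioate-modified A, C, G and T).

```python
# Mapping taken from the Table 12 legend: coded letter -> underlying base.
PS_CODES = {"F": "A", "O": "C", "E": "G", "Z": "T"}


def decode_oligo(coded):
    """Return (plain_sequence, ps_positions) for a Table 12 style coded oligo.

    ps_positions lists the 0-based indices of phosphorothioate-modified bases.
    """
    bases = []
    ps_positions = []
    for index, symbol in enumerate(coded.upper()):
        if symbol in PS_CODES:
            bases.append(PS_CODES[symbol])
            ps_positions.append(index)
        elif symbol in "ACGT":
            bases.append(symbol)
        else:
            raise ValueError(f"unexpected symbol {symbol!r} at position {index}")
    return "".join(bases), ps_positions


# SEQ ID 44 from Table 12.
sequence, modified = decode_oligo("OZTCACCTACGGCGZEC")
print(sequence)   # CTTCACCTACGGCGTGC
print(modified)   # [0, 1, 14, 15]
```

For SEQ ID 44 the decoded sequence carries two modified bases at each end, the same pattern seen at the ends of the other coded oligonucleotides in the table.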

While the foregoing embodiments have been described in some detail for purposes of clarity and understanding, it will be clear to one skilled in the art from a reading of this disclosure that various changes in form and detail can be made without departing from the true scope of the embodiments disclosed herein. For example, all the techniques, apparatuses, systems and methods described above can be used in various combinations.

1. A method for producing a nucleic acid molecule, the method comprising performing polymerase chain reaction (PCR) in a reaction mixture containing (i) a double-stranded nucleic acid segment and (ii) at least one oligonucleotide capable of hybridizing to nucleic acid at one terminus of the double-stranded nucleic acid segment, wherein the nucleic acid molecule is produced by the PCR reaction, and wherein the product nucleic acid molecule contains at or near one terminus a promoter suitable for in vitro transcription.
2. The method of claim 1, wherein the nucleic acid molecule produced by the PCR reaction encodes an RNA molecule from 35 to 150 nucleotides in length.
3. The method of claim 1, wherein the nucleic acid molecule produced by the PCR reaction is from 70 to 150 base pairs in length.
4. The method of claim 1, wherein the nucleic acid molecule produced by the PCR reaction encodes an RNA molecule with at least two hairpin turns.
5. The method of claim 1, wherein the nucleic acid molecule produced by the PCR reaction encodes a CRISPR RNA.
6. The method of claim 5, wherein the nucleic acid molecule produced by the PCR reaction encodes a guide RNA.
7.-26. (canceled)
27. A method for gene editing at a target locus in a cell, the method comprising: (A) introducing into the cell a CRISPR protein and a CRISPR RNA molecule, wherein the CRISPR RNA has a region of sequence complementarity of at least 10 base pairs to the target locus, and (B) introducing into the cell a donor DNA molecule, wherein the CRISPR protein and the CRISPR RNA molecule are introduced into the cell first before the DNA molecule is introduced into the cell, and wherein the CRISPR RNA molecule has two regions of sequence complementarity with a target locus in the cell.
28. The method of claim 27, wherein the CRISPR protein and the CRISPR RNA molecules comprise a Cas9/gRNA complex.
29. The method of claim 27, wherein the CRISPR protein and the CRISPR RNA molecules are introduced into the cell by electroporation.
30. The method of claim 27, wherein the donor DNA molecule is single-stranded.
31. The method of claim 27, wherein the donor DNA molecule has two regions of sequence complementarity to the target locus in the cell and an intervening region.
32. The method of claim 31, wherein the donor DNA molecule comprises two regions of sequence complementarity to the target locus in the cell that are independently between 30 and 50 nucleotides in length.
33. A method for introducing different CRISPR RNA molecules into cells, the method comprising contacting different samples of the cells with different CRISPR RNA molecules, wherein the different CRISPR RNA molecules are combined with a CRISPR protein prior to contacting the different samples of the cells, and wherein, prior to introduction in the cells, the CRISPR protein is stored under conditions where the CRISPR protein will retain at least 75% activity after six months at 4° C.
34. The method of claim 33, wherein the CRISPR protein is stored with a transfection reagent, a cell culture medium, or a CRISPR RNA molecule.
35. The method of claim 33, wherein the CRISPR protein is stored in separate locations.
36. The method of claim 35, wherein the separate locations are wells of a multi-well plate.
37. The method of claim 35, wherein the different CRISPR RNA molecules are individually added to the separate locations where the CRISPR protein is stored.